#python

Deleting all your Tweets with Python


Twitter doesn't offer an option to delete all the content you've posted there, and that's no surprise. They're a social media business after all; their value comes from the content that exists on the platform. It's not good news for them to see content go away.

But thankfully, they do provide you with an API to interact with your data, which includes delete functionality. So while you can't perform bulk deletion from the Twitter interface (whether web, desktop or mobile), you can achieve it by executing a few lines of code that make use of the Twitter API.

 



Who is this post for

If you just don't want to be on Twitter anymore for whatever reason, and want to erase every trace of it, you can simply go to the settings and deactivate your account. In 30 days, your account and all the data associated with it will have been removed. This post is not for you.
However, if you're like me and see value in using Twitter, and still want to keep your account and the community you've built on there, keep reading. This post explains how to delete the public content you've posted there, without affecting your list of followers and following, your bookmarks, lists, direct messages or faves... I mean likes. To put it simply, we'll just be deleting tweets, which includes replies, quotes and retweets.
Another thing I'd like to note here is that my target audience for this post is developers, or let's say tech-savvy people in general. You won't have to write any code yourself, but you do need to be somewhat familiar with the concepts of a programming language, package manager, third-party package, API, etc. Now I could have made this a beginner-friendly guide, but then I'd have to explain some basic things, which would increase the length of this post more than I'd like. That being said, I do believe that, given more time, even someone who's never touched code in their life could get it to work. I know that a need to delete tweets could arise for anyone, so if you're one of those people who are struggling with the process, drop a comment down below or send me an email, and I'd be happy to share some helpful resources or tips.


Why delete all your tweets

If you're a professional, especially a software professional, I think it's very important to have an online presence, other than just having a GitHub profile. I'm talking about a platform where you post opinions on matters, even outside your area of expertise (we're more than our job, right?). When people search your name, your brand must pop up. You can do that via Twitter, LinkedIn, Medium, a personal blog, or whatever combination of these.
But just like having an online presence can help you boost your professional network, it can also backfire. If the first thing that shows up when someone googles your name is a tweet of you passionately arguing politics with a stranger, that's not a good first impression. You could even be on the right side of the argument, but it doesn't matter. It's not how you should introduce yourself to the world.
Things get worse if you've posted some remark that could be taken out of context, because the other account(s) have been deactivated and the conversation isn't fully available. Or if it's some inside joke you replied to a friend. We've entered the age of online accountability. You could be held responsible for something you said on social media, just like you would in real life (usually more, because people are angrier on Twitter). Now you can argue all day and night whether that's a good change or not, but this post isn't about politics. Things are the way they are, and starting off with a clean slate of tweets would be a safe and practical bet for anyone, especially if you've been active on the platform since you were 13 years old, like I have.


Step by step guide

Disclaimer: following the instructions below will result in deleting all your tweets, which may include old photos or memories that are precious to you. The action is not reversible. Please refer to the "Who is this post for" section above for more details on that before proceeding.
First of all, we have to sign up as a user of the Twitter API. It comes with a free plan which should be good enough for our purposes. At the time of writing this, to apply for a developer account, you must go to Settings and Privacy / Additional Resources / Developers. The process of applying is quite straightforward, so I won't cover it here. Just follow all the steps that are displayed on the screen.
After your application is submitted, it can take anywhere from a few minutes to a couple of days before you get an email with the response. If your application is approved, you should be able to navigate to the developer account interface (same place where you applied earlier). Here you have to collect some credentials which our script needs in order to authenticate with the Twitter API. These are: consumer key, consumer secret, access token and access token secret. Bear in mind that these credentials are sensitive data; whoever has them can authenticate as you and perform all sorts of actions on your behalf. So don't share them around.
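Since these credentials are sensitive, one way to keep them out of the script file itself is to read them from environment variables. The following is only a minimal sketch of that idea, not the script from this post; the variable names and the load_credentials helper are hypothetical and can be renamed freely.

```python
import os

# Hypothetical environment variable names; pick whatever suits your setup.
CREDENTIAL_VARS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_credentials(env=os.environ):
    """Return the four credentials as a dict, or raise if any is missing."""
    missing = [name for name in CREDENTIAL_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing credentials: " + ", ".join(missing))
    return {name: env[name] for name in CREDENTIAL_VARS}
```

The four values returned here would then be handed to tweepy's OAuth 1.0a handler (OAuth1UserHandler in tweepy 4.x, as far as I know) to build the API client.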
Alright, now that we have the credentials to interact with the API, we don't need anything else from the developer account interface. We can close it and proceed with the code. In the following link you can find the short script I wrote. You don't need to add anything to it; just replace some values that are specific to you (more on that later). As you can see, it's just a few lines of code and some simple operations.

I went with Python, which is my scripting language of choice and easy to use in general, but you can really do this with any general-purpose programming language out there. Though I must stress that tweepy, which is a third-party library, helps a lot by abstracting away some of the needless complexity and confusion the Twitter documentation brings about. Between different versions of the API, half a dozen authentication methods with different access levels, ambiguous guidelines and poor navigation, it's a bit of a mess. I struggled for a while before deciding to just use a library instead of interacting with it directly.
Make sure Python is installed by running python --version. If that doesn't work, try python3 instead. If neither works, you'll have to install Python. After that, run pip install tweepy to install the third-party package. Now open the script and replace the four credential values with your own. The last thing you need to do is replace the NUMBER_OF_TWEETS constant with your own number. You can get this value by going to your Twitter account (not developer account) profile. As you scroll down through your own tweets, the total number of tweets you have posted should appear at the top.
The rest of the script calculates how many fetch requests it needs to make, depending on your NUMBER_OF_TWEETS and REQUESTS_PER_PAGE. Leave the latter at its maximum for faster execution times. On each iteration, that is, for each batch of tweets it receives, it loops over the batch and sends a delete request for every tweet.
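The logic described above can be sketched roughly as follows. This is a reconstruction from the description, not the original script: it assumes api is a tweepy.API client (v1.1), that the page maximum is 200, and the helper names pages_needed and delete_all are made up for illustration.

```python
import math

REQUESTS_PER_PAGE = 200  # assumed per-request maximum


def pages_needed(number_of_tweets, per_page=REQUESTS_PER_PAGE):
    """How many timeline fetches are needed to cover every tweet."""
    return math.ceil(number_of_tweets / per_page)


def delete_all(api, number_of_tweets, per_page=REQUESTS_PER_PAGE):
    """Fetch the newest batch, delete each tweet in it, and repeat."""
    deleted = 0
    for _ in range(pages_needed(number_of_tweets, per_page)):
        # Always asks for the most recent tweets; the previous batch
        # has already been deleted, so a fresh batch comes back.
        batch = api.user_timeline(count=per_page)
        if not batch:  # nothing left to delete
            break
        for tweet in batch:
            api.destroy_status(tweet.id)
            deleted += 1
    return deleted
```

Because each batch is deleted before the next fetch, asking for "the most recent tweets" every time is enough here and no explicit pagination is needed.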

That's all there is to it. Run python my_script.py, where my_script is whatever you've named your Python file, and in a few seconds or minutes, depending on how many tweets you have, it should do the trick. Open Twitter and you can verify that it worked.


Optimizations

There are obviously a couple of things that could be improved about the script, but I didn't see much value in doing them, as this isn't some state-of-the-art production code. It's just a one-off utility script that gets its job done. I'm going to list these things here anyway, for anyone wanting to go a step further with it.
First off, we're manually getting the total number of tweets, and placing it as a constant in the script. I'm sure the API must have an endpoint that returns this number, so it'd be better to automate this step. But I didn't find anything at a quick glance, and I didn't have time to dive into the complete API reference.
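One possible way to automate this step, sketched under the assumption that api is a tweepy.API client for the v1.1 API: the user object returned for the authenticated account carries a statuses_count field, which should be the number shown on the profile. The count_tweets helper name is hypothetical.

```python
def count_tweets(api):
    """Return the authenticated user's total tweet count.

    Assumes `api` is a tweepy.API (v1.1) client, whose
    verify_credentials() returns the account's own user object.
    """
    me = api.verify_credentials()
    return me.statuses_count
```

The result could replace the hand-typed NUMBER_OF_TWEETS constant, for example NUMBER_OF_TWEETS = count_tweets(api).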
Another potential problem, if the number of tweets is too big, is the fact that the script is entirely synchronous. It would take a long time if you have over 100K tweets let's say, which isn't that uncommon. Ideally, you'd launch multiple requests in an async fashion, but in that case you have to consider the issue of pagination. By default, the user_timeline() function gets the most recent tweets, so all your requests would be operating on the same batch of tweets. Not only would the script operate on just a tiny fraction of all your tweets, but it wouldn't even delete that first batch entirely either, because at some point one process would try to delete a tweet that some other async process just deleted, resulting in an error and halting the execution of the script. So if you do go down that path, you'd need to paginate the fetch result appropriately, perhaps based on the date of the tweets or some other way that tweepy suggests.
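A rough sketch of one way to paginate, assuming api is a tweepy.API (v1.1) client: walk backwards through the timeline with the max_id parameter, collect every tweet ID first, and only then deal the IDs out into disjoint chunks so that no two workers ever touch the same tweet. The helper names iter_tweet_ids and split_evenly are made up for this example.

```python
def iter_tweet_ids(api, per_page=200):
    """Yield every tweet ID the timeline exposes, newest first."""
    max_id = None
    while True:
        kwargs = {"count": per_page}
        if max_id is not None:
            kwargs["max_id"] = max_id
        batch = api.user_timeline(**kwargs)
        if not batch:
            return
        for tweet in batch:
            yield tweet.id
        # max_id is inclusive, so step just below the oldest ID seen.
        max_id = min(tweet.id for tweet in batch) - 1


def split_evenly(ids, workers):
    """Deal IDs into `workers` disjoint chunks so no tweet is handled twice."""
    return [ids[i::workers] for i in range(workers)]
```

Collecting the IDs up front costs one extra pass over the timeline, but it sidesteps the overlapping-batch problem entirely. tweepy also ships a Cursor helper that may be a tidier way to do the same walk.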
The last optimization that comes to mind concerns the delete requests. Even if you do launch all the get requests asynchronously, each of them will have to make 200 synchronous delete requests, which is quite an I/O bottleneck. I'd research tweepy or the Twitter API reference to see if there is a way to perform bulk deletion with a single delete request. If not, the other option is to make this part of the code async as well.
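As far as I know there is no bulk delete endpoint, so running the deletes concurrently is the more likely route. Here is a minimal sketch using a thread pool from the standard library rather than true async code. It assumes api is a tweepy.API (v1.1) client and that sharing one client across threads is safe, which I haven't verified; the delete_concurrently name is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def delete_concurrently(api, tweet_ids, max_workers=8):
    """Delete tweets in parallel threads; return the IDs that failed."""

    def try_delete(tweet_id):
        try:
            api.destroy_status(tweet_id)
            return None
        except Exception:  # e.g. the tweet is already gone
            return tweet_id

    # pool.map keeps results in input order; the with-block waits
    # for every delete to finish before moving on.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(try_delete, tweet_ids))
    return [tweet_id for tweet_id in results if tweet_id is not None]
```

Catching the error per tweet means one failed delete no longer halts the whole run; the returned list can simply be retried or inspected afterwards.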
As I mentioned above, these are good ideas if you want to play around with it and have some fun, but I left them out because I don't see much return on investment. It's highly unlikely someone would run this script on a regular basis. Therefore, the possibility that it might take up to half an hour or even more for big workloads isn't that important.