Jaruzel's Retro Website

How I Erased 5000+ Facebook Comments and Likes

or, "How to Delete Your Facebook History"


Updated - 16th April 2018

I've now released a copy of the tool I used to delete my Facebook history as open source. You can browse the source, or just download the executable.

Like many people, I was appalled at the recent exposure of people's Facebook data. Although I had stopped using Facebook to post about myself over a year ago, I was still using it to comment on, and react to, friends' posts and photos.

When I stopped posting my own stuff on Facebook, I wrote a small script to delete every post I had ever made, so I knew that the posts part of my Facebook profile was already purged and clean. What I wanted to do now was remove all my previous activity.

I did not want to #DeleteFacebook, as some of my extended family still use it heavily, and they live far enough away that regular face-to-face visits are impractical. Also, there are anecdotal rumours that asking Facebook to delete your profile merely tombstones the account, and that the data is never properly purged from their systems (anonymised or not).

I know I cannot close the barn door on any data of mine that's already out in the wild, but I can control any further scrapes of my Facebook data by manually removing as much of my Facebook Activity as I can. Unfortunately, and not unexpectedly, Facebook do not give you a simple way to do this.

There are several browser extensions that are available to do what I am attempting, but as a hobbyist coder, it's always more fun to explore how to do these things yourself...

So, where to start? Having already purged all my Posts from Facebook a year ago, I knew that the main Facebook site is such a mess of dynamic HTML that trying to scrape it would frustrate me pretty quickly. My tools of the trade are typically .NET Framework based, and I am not an expert at JavaScript, so I fired up Visual Studio, created a new empty project, and dropped a standard WebBrowser control on a blank form.

This didn't need to be good or clever code, it just needed to work...

Having already identified that, for me at least, scraping www.facebook.com would just be an exercise in frustration, I decided to use the very basic version of Facebook:

https://mbasic.facebook.com

This version is very light on JavaScript, and the HTML is rendered on the server rather than generated by the browser. This makes exploring the document's HTML tags much easier. In the basic version of Facebook, the Activity Log is found here:

https://mbasic.facebook.com/<your-user-id>/allactivity

On this page you can see all your recent activity, but more importantly, you can see links to the years and months. It's these links we'll be using to walk the activity history. I created some code that told the WebBrowser control to navigate there and tell me when it's done.
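As a small illustration (in Python, not the .NET code the tool itself uses), the starting URL can be built from the pattern shown above. The helper name `activity_log_url` and the example id are made up for this sketch; only the base address and the `/allactivity` path come from the article.

```python
from urllib.parse import quote

# Base address of the basic site, as given in the article.
BASE_URL = "https://mbasic.facebook.com"


def activity_log_url(user_id):
    """Build the Activity Log URL for the basic site (hypothetical helper)."""
    # quote() guards against characters that are not valid in a URL path segment.
    return f"{BASE_URL}/{quote(str(user_id), safe='')}/allactivity"


# "your.user.id" is a placeholder, not a real account.
start_url = activity_log_url("your.user.id")
print(start_url)
```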

[Image: Facebook Activity Log Example]

The first step was to collect only the top-level links of the root page of the Activity Log that linked to the next layer of activity. Helpfully these links follow a standard format and always contain:

/allactivity?timeend=
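To make the link filtering concrete, here is a minimal Python sketch of the idea, assuming the page HTML is already available as a string. It is an illustration only and not the WebBrowser-based code from the tool. The class and function names, and the sample HTML with its made-up `timeend` values, are assumptions.

```python
from html.parser import HTMLParser

# Substring that identifies the year/month links, as quoted in the article.
ACTIVITY_MARKER = "/allactivity?timeend="


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def activity_links(html):
    """Return only the links that point deeper into the Activity Log."""
    parser = LinkCollector()
    parser.feed(html)
    return [href for href in parser.links if ACTIVITY_MARKER in href]


# Made-up sample page; real pages and parameter values will differ.
sample_html = """
<a href="/home.php">Home</a>
<a href="/someuser/allactivity?timeend=1111">2018</a>
<a href="/someuser/allactivity?timeend=2222">2017</a>
"""
found = activity_links(sample_html)
print(found)
```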

Once we've got a list of those, we navigate to each one. Again, we're only interested in links that contain the above string. Any new links we find (that we've not seen before) are added to the list of pages we are interested in. In the basic version of Facebook, Activity links that are labelled with only a year tend to route to a similar page with the year broken down into further month links. If the code walks these two levels correctly, you should have a list of URLs that contains 12 month links for every year your Facebook account has existed.
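The walk described above can be sketched as a simple queue with a "seen" set. This is a hedged Python illustration rather than the tool's actual code: `get_links` is a hypothetical stand-in for "navigate to the page and return its links", and the in-memory `fake_site` exists only so the sketch runs without touching Facebook. Note that this version keeps walking until no new links turn up, instead of stopping at exactly two levels.

```python
from collections import deque

ACTIVITY_MARKER = "/allactivity?timeend="


def walk_activity_pages(root_url, get_links):
    """Collect every activity page reachable from root_url, without duplicates.

    get_links(url) is a hypothetical callable returning the hrefs on that page.
    """
    seen = set()
    pages = []
    queue = deque([root_url])
    while queue:
        url = queue.popleft()
        for href in get_links(url):
            if ACTIVITY_MARKER in href and href not in seen:
                seen.add(href)      # remember it so it is only queued once
                pages.append(href)
                queue.append(href)  # year pages lead on to month pages
    return pages


# Made-up site map: a root page, two year pages, and three month pages.
fake_site = {
    "/me/allactivity": [
        "/me/allactivity?timeend=2018",
        "/me/allactivity?timeend=2017",
        "/home",
    ],
    "/me/allactivity?timeend=2018": [
        "/me/allactivity?timeend=2018-01",
        "/me/allactivity?timeend=2018-02",
        "/me/allactivity?timeend=2017",
    ],
    "/me/allactivity?timeend=2017": ["/me/allactivity?timeend=2017-12"],
}
pages = walk_activity_pages("/me/allactivity", lambda u: fake_site.get(u, []))
print(pages)
```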

At this point we can start hunting down Comments, Reactions and Likes. The Delete button for these can be easily identified by the following string matches in the links in each page:

/allactivity/removecontent/
/allactivity/delete/

We take the list of URLs we collected previously, and direct the WebBrowser control to navigate to each one. Once there, we parse all the links in the page looking for the above string matches. If a link contains one of these strings, that link is added to a list of URLs that we call the DeleteList. Once all the year/month URLs have been navigated, we should have all our basic Activity collected for deletion.
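Building the DeleteList might look something like the following Python sketch. Again, this is an illustration under assumptions, not the tool's code: `get_links` is a hypothetical function that returns the links found on a page, and the sample pages are invented. Only the two marker strings come from the article.

```python
# Substrings that identify Delete links, as quoted in the article.
DELETE_MARKERS = ("/allactivity/removecontent/", "/allactivity/delete/")


def collect_delete_links(page_urls, get_links):
    """Visit each year/month page and gather the links behind the Delete buttons."""
    delete_list = []
    seen = set()
    for page in page_urls:
        for href in get_links(page):
            is_delete = any(marker in href for marker in DELETE_MARKERS)
            if is_delete and href not in seen:
                seen.add(href)
                delete_list.append(href)
    return delete_list


# Made-up pages and links, purely for demonstration.
fake_pages = {
    "/me/allactivity?timeend=2018-01": [
        "/allactivity/delete/?id=1",
        "/story.php?id=1",
    ],
    "/me/allactivity?timeend=2018-02": [
        "/allactivity/removecontent/?id=2",
        "/allactivity/delete/?id=1",  # duplicate, should be ignored
    ],
}
delete_list = collect_delete_links(list(fake_pages), lambda u: fake_pages[u])
print(delete_list)
```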

Now we tell the WebBrowser control to navigate to each of the URLs in the DeleteList. Each URL, because it sat behind the Delete button for an activity, causes Facebook to delete that specified Activity.

Additionally, I added a check for any link that had the text "load more" in it, as the basic version of Facebook doesn't show more than a few Activities per page. A busy month would result in several pages hidden behind nested "load more" links.
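One way to follow those nested links, sketched in Python: collect each anchor's text along with its href, and keep following anything whose text contains "load more". The helper names, the hypothetical `get_html` fetcher and the sample pages are all assumptions made for this example, and the real link text may be cased or worded differently.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collects (href, text) pairs for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.anchors = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.anchors.append((self._href, "".join(self._text).strip()))
            self._href = None


def load_more_links(html):
    """Return hrefs of links whose visible text contains 'load more'."""
    parser = AnchorCollector()
    parser.feed(html)
    return [href for href, text in parser.anchors if "load more" in text.lower()]


def pages_for_month(start_url, get_html):
    """Follow 'load more' links from start_url until none are left."""
    pending, visited = [start_url], []
    while pending:
        url = pending.pop()
        if url in visited:
            continue
        visited.append(url)
        pending.extend(load_more_links(get_html(url)))
    return visited


# Made-up month split across three pages.
fake_html = {
    "/month": '<a href="/month?p=2">Load more</a><a href="/x">Other</a>',
    "/month?p=2": '<a href="/month?p=3">load more</a>',
    "/month?p=3": "<p>No more links</p>",
}
month_pages = pages_for_month("/month", lambda u: fake_html[u])
print(month_pages)
```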

I deliberately added a one second throttle into the WebBrowser loop code so that the automated navigation of the URLs remained stable. Previous scraping work like this had shown me that the WebBrowser control cannot keep up with fast code loops, and page data can be truncated or not loaded at all. I also didn't want to trigger any anti-scraping detection that Facebook may have. They probably don't, but it's better to be safe than sorry.
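The throttle itself is nothing more than a pause between navigations. A minimal Python sketch follows, assuming a hypothetical `navigate` callable that loads one URL. The `sleep` parameter is only there so the demonstration can record the pauses instead of actually waiting; in real use the default `time.sleep` applies.

```python
import time


def visit_throttled(urls, navigate, delay_seconds=1.0, sleep=time.sleep):
    """Call navigate(url) for each URL, pausing between consecutive calls."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # one second by default, as in the article
        results.append(navigate(url))
    return results


# Demonstration: record the pauses rather than sleeping for real.
pauses = []
visited = visit_throttled(
    ["/a", "/b", "/c"],
    navigate=lambda url: url,  # stand-in for loading the page
    sleep=pauses.append,
)
print(visited, pauses)
```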

Having already run through my code without actually navigating to the Delete pages, I was fairly sure that my code did what I needed to do. So I let it rip, and kissed goodbye to all my activity.
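A dry run like the one mentioned above can be as simple as a flag that skips the navigation step. The Python sketch below is an assumption about how such a switch could look, not a description of the tool's code; `navigate` is a hypothetical callable and the URLs are invented.

```python
def run_deletions(delete_list, navigate, dry_run=True):
    """Visit each Delete URL, or only report what would be visited.

    Returns the list of URLs that were actually navigated to.
    """
    deleted = []
    for url in delete_list:
        if dry_run:
            print("would delete:", url)
        else:
            navigate(url)  # visiting the link is what triggers the deletion
            deleted.append(url)
    return deleted


sample_list = ["/allactivity/delete/?id=1", "/allactivity/removecontent/?id=2"]
calls = []

# First pass: nothing is navigated, the URLs are only printed.
rehearsal = run_deletions(sample_list, navigate=calls.append, dry_run=True)

# Second pass: every URL is handed to navigate().
real_run = run_deletions(sample_list, navigate=calls.append, dry_run=False)
```

Defaulting `dry_run` to `True` means the destructive path has to be requested explicitly.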

Or So I Thought...

[Image: Facebook Delete Script in Action]

After some time (more than an hour, less than three), I rechecked my Activity Log via mbasic.facebook.com and all seemed nice and empty. Hooray, success! To be absolutely sure, I switched to the full version of Facebook, and checked my Activity Log there. There were still hundreds of comments, likes, and reactions! Something, somewhere had gone wrong...

I could continue to bore you with all the investigatory steps I took to track down what was going on, but I won't, as you are probably bored already. It turns out that the basic version of your Activity Log does not surface all your activity by default. To see all your comments, or all your reactions, you need to set a filter by clicking on the big filter button and picking an activity type from the list. You know you've done it when the top of the log shows the filter type:

[Image: Facebook Activity Log Filter]

In order to dig up every comment or reaction I ever made, I needed to repeat everything I'd done, for each Activity type I wanted to erase. For me, as I wasn't a heavy user of most of Facebook's features, this boiled down to just these:

Yes, Facebook makes a note of every video you clicked on in your newsfeed. I hadn't realised this, and stumbled over it as I was checking each Activity type. Personally, I think this metadata is used heavily by the algorithms that analyse your profile for advertising. For once, Facebook does the honourable thing here, and there is a delete video history button. Other activities don't seem to have this option.

Anyway, once I'd re-run the code against each activity type, my activity feed was as clean as it could be. There are some activities that can't be deleted, such as changes to your profile information, but it's not much compared to what you can delete.

A couple of things I learnt on this journey:

I hope the breakdown of my attempt to erase my Facebook history without deleting my Facebook account has been an interesting read for you. If you want to ask questions, or simply throw abuse at me (please don't...), I can be found on Twitter.

See Also: Hacker News Comment Thread

- Jaruzel, March 2018