Facebook Origins – An OSINT deep dive on account creation
**Since I published this last month I have gone back and discovered 15-digit ID numbers in November 2009, so I have updated part of this blog accordingly**
One thing I love about OSINT is that sometimes it’s highly technical, involving web APIs and Python coding. Other times it’s all about good old-fashioned research. This example is the latter (although driven by the former).
Earlier this year, I needed to investigate some Facebook accounts for a case I was working. Somebody was impersonating somebody else online, so account creation date was my target data. I noticed that some accounts have a Facebook username associated and some accounts only have a Facebook ID number.
To clarify the way Facebook account creation works: ALL Facebook accounts have an ID number that is assigned at time of registration, but you only get a username if you register one. As described here:
Consider an anonymous account intended for mischief or a sock puppet account for research. Those accounts are likely going to be registered with a throwaway email account, and unless the user is going to keep it as a long-term alias, the username probably won’t ever be registered. It will always have a Facebook ID #, so that is what my focus is on.
Profile and ID numbers
Mark’s username is ‘Zuck’ and his page can be found at www.facebook.com/zuck
Mark’s Facebook ID # is 4. If you navigate a browser to https://www.facebook.com/profile.php?id=4 you will end up on the same page.
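The two URL forms above can be sketched in a few lines of Python. This is only an illustration: the function name `profile_url` is my own, and it assumes that anything made up entirely of digits should be treated as an ID number rather than a username.

```python
def profile_url(identifier):
    """Build a profile URL from a Facebook ID number or a username.

    Assumption: an all-digit identifier is an ID number; anything else
    is treated as a registered username.
    """
    text = str(identifier).strip()
    if text.isdigit():
        # ID numbers use the profile.php form shown above
        return "https://www.facebook.com/profile.php?id=" + text
    # Usernames sit directly under the domain
    return "https://www.facebook.com/" + text


print(profile_url(4))
print(profile_url("zuck"))
```

Either form should land on the same profile, which is handy when the only thing you have for a suspect account is the ID number.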
Mark launched Facebook in February 2004 and his user ID is #4.
I joined Facebook in 2006 and my user ID is in the 500,000,000 realm.
A buddy of mine joined kind of late in the game in 2012 and his user ID is in the 10004000000000 realm.
Facebook has only been around since 2004, so if I find one person who registered in each month since it was created, I can make a guide that estimates creation date from the ID number. Right?
No, not quite. Why?
Facebook History Lesson
The history of the site is fairly interesting to research and it sheds some light on the user ID structure for me as well.
After a little research we can see a timeline take shape:
February 2004 – Facebook launches for Harvard students only
March 2004 – Facebook expands to Columbia, Stanford, Yale, MIT, Boston College, Boston University, Northeastern University and Dartmouth
College expansions continue and some more notable timeline points are:
September 2005 - High School versions are added
October 2005 - 21 universities in the UK and worldwide are added
September 2006 - Open registration for anyone age 13 and up with an email address
What does it mean?
The research takes a slightly technical turn again at this point. As I pressed the idea of coming up with a timeline of plotted account creations, I found this article on Quora:
So we see that even though Zuckerberg’s account is ID #4 at Harvard, the schools added along the timeline were segmented into SQL groupings. So the first Columbia student may have been assigned ID 100001, the first Stanford student 200001, and so on. The plotted timeline doesn’t work in this case because the different school groupings could overlap in a non-sequential manner.
So my timeline isn’t completely out the window, but a simple plotted graph solution is not going to work. I can’t be the only person to have thought of this. Right?
Nope. An article on metadatascience.com by Massoud Seifi did some great analysis with the Facebook IDs before and after the 64-bit ID transition mentioned at the end of the Quora article above.
Massoud says in the article:
What Massoud has done is prove the difficulty I was going to see in plotting data points from the start to finish across the history of Facebook. His graph on the right could very well be a cluster of Stanford, Yale and Columbia students with no discernable correlation.
The data set he used for this graph is here: http://metadatascience.com/2013/03/14/lookup-table-for-inferring-facebook-account-creation-date-from-facebook-user-id/
My own research
I started some of my own data collection prior to finding Massoud’s work and noticed an anomaly in the data around December 2009, which ended up being the transition point of the 64-bit UID numbering.
Facebook User ID Account Creation Date
What we see is that in December 2009, between the 8th and the 12th, the Facebook ID numbers changed from a 10-digit format to a 15-digit format. Because both 10-digit and 15-digit numbers appear on overlapping dates in December 2009, we can’t determine a clean transition date. We can make one major assumption from this data: an account with a 15-digit ID number can likely NOT be older than November 2009.
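That assumption is easy to turn into a quick check. The sketch below is my own illustration, not a verified rule: the function name and the choice of the first of the month as the cutoff are assumptions, and for shorter IDs it only falls back to Facebook's February 2004 launch as a floor.

```python
from datetime import date

# Assumption from the data above: 15-digit IDs do not appear before
# November 2009. The exact day is unknown, so the 1st is used as a floor.
EARLIEST_15_DIGIT = date(2009, 11, 1)

# Facebook launched in February 2004, so no account can predate that.
FACEBOOK_LAUNCH = date(2004, 2, 1)


def earliest_possible_creation(fb_id):
    """Return a rough lower bound on when an account could have been created."""
    digits = len(str(int(fb_id)))
    if digits >= 15:
        return EARLIEST_15_DIGIT
    # Shorter IDs can be from any time since launch (and, during the
    # overlap period, slightly after the transition as well).
    return FACEBOOK_LAUNCH


print(earliest_possible_creation(100004000000000))
print(earliest_possible_creation(4))
```

This only gives a lower bound. It says nothing about how recent an account is, and a shorter ID does not prove the account is older than the transition.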
As luck would have it the day after I decided to start plotting out this data, my wife was showing me a video she got from Facebook that showed her when she joined. It’s called Faceversary and it is a wonderful source of OSINT!
Run a search on Tagboard for the term ‘faceversary’ and you have an easy way to start cataloging account creation dates.
If you click on people’s names it will take you to their Facebook page and you can see the actual #faceversary video. Around the 31 second mark of each video we get this lovely moment:
Previously we had to research a Facebook user’s first post or first profile photo for an approximation of when they created their account. My own page is a prime example of why this is only a guess since I joined in 2006, but my first post is in 2007. So while there are traditional OSINT methods to research we now have an easy out. With so many people publicly posting Faceversary videos it shouldn’t take much for a keen investigator to use Facebook IDs to determine very accurate account creation windows.
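To show how collected Faceversary dates could be used, here is a minimal sketch that brackets an unknown ID between the nearest known IDs. It rests on an assumption I have not proven: that within one ID range (for example the 15-digit IDs) numbers are handed out in increasing order over time. The function name `creation_window` is mine, and the sample pairs are made-up placeholders, not real accounts.

```python
from bisect import bisect_left
from datetime import date


def creation_window(fb_id, known):
    """Bracket fb_id between the closest known (id, creation_date) pairs.

    `known` is a list of (facebook_id, creation_date) tuples gathered by
    hand. Returns (earliest, latest); either side is None when no
    reference point exists on that side.
    Assumption: IDs in this range increase with creation date.
    """
    known = sorted(known)
    ids = [pair[0] for pair in known]
    i = bisect_left(ids, fb_id)
    if i < len(ids) and ids[i] == fb_id:
        # Exact match: the window collapses to the known date
        return known[i][1], known[i][1]
    earliest = known[i - 1][1] if i > 0 else None
    latest = known[i][1] if i < len(known) else None
    return earliest, latest


# Placeholder reference points for illustration only (NOT real data)
samples = [
    (100000000000100, date(2010, 1, 15)),
    (100000000000900, date(2010, 3, 2)),
]
print(creation_window(100000000000500, samples))
```

The more reference points you catalog, the tighter the window gets. IDs from before the transition would need a separate list per school grouping, since those ranges overlap in time.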
My research here isn’t over. With a little OSINT work I was able to determine that the Yale Facebook ID blocks are likely 300000-399999. Remember the format:
With enough research I’m confident I can determine more of the school groupings. Then I can run analysis on the data in each school's SQL group separately to drill in the data prior to the November 2009 transition.
I will update the blog as I go.
**Since I started researching this I have done manual OSINT and estimate the following SQL groupings by school:**
Columbia ID block 100000-199999
Stanford ID block 200000-299999
Yale ID block 300000-399999
Cornell ID block 400000-499999
Dartmouth ID block 500000-599999
NYU ID block 800000-899999
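As a rough sketch, the estimated blocks above can be dropped into a lookup table. The ranges are my estimates from manual research, not confirmed values, and the function name `school_for_id` is just for illustration.

```python
# Estimated ID blocks from the list above (unconfirmed estimates)
SCHOOL_BLOCKS = [
    (100000, 199999, "Columbia"),
    (200000, 299999, "Stanford"),
    (300000, 399999, "Yale"),
    (400000, 499999, "Cornell"),
    (500000, 599999, "Dartmouth"),
    (800000, 899999, "NYU"),
]


def school_for_id(fb_id):
    """Return the estimated school grouping for an early ID, or None."""
    fb_id = int(fb_id)
    for low, high, school in SCHOOL_BLOCKS:
        if low <= fb_id <= high:
            return school
    # Outside every estimated block (including the unmapped gaps)
    return None


print(school_for_id(300001))
print(school_for_id(650000))
```

An ID that falls in a gap, or outside these ranges entirely, simply returns None until more groupings are identified.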
My focus has shifted to account data post-2009, so I'm putting the school group research on hold temporarily.
If you have any techniques to share or comments please drop me a line on Twitter @baywolf88