Facebook, Instagram, and WhatsApp suffered a six-hour outage yesterday, their longest since 2008. Here is what caused it, and the deeper chaos that lay beneath the surface.
If you checked your phone last evening and wondered why your WhatsApp text wasn’t delivered, it wasn’t you or your WiFi; it was WhatsApp and Facebook. You were also one of two billion users who were digitally paralyzed. However, beyond the delayed texts and social media posts, the outage did far more damage than most of us realize.
Forbes reported that the outage cost Mark Zuckerberg a hefty $5.9 billion in net worth and knocked 4.8% off Facebook’s stock price. What’s more, Facebook engineers were scrambling as their internal communication systems took a hit, too. Reports suggest that some employees couldn’t even use their office IDs to get into their offices. Nor could they reliably connect over e-mail or log into their systems.
But the big question is, what caused this outage? And why was it on such a massive scale? Facebook Inc. gave us a highly cryptic answer as usual, but the internet eventually found its own answers.
For those of you wondering, this wasn’t a hack or an attack of any sort. It was just an error. Facebook’s own explanation: “This disruption to network traffic had a cascading effect on the way our data centres communicate, bringing our services to a halt.” Yes, cryptic and very much like Zuck himself. In its simplest form: Facebook, Instagram, and WhatsApp couldn’t be found on DNS servers. What are DNS servers? They’re like the internet’s phonebook, matching each website’s name to the unique numeric address that computers use to reach it.
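For the curious, the phonebook idea can be sketched in a few lines of Python. This is only an illustration using the standard library's resolver call; the helper name `lookup` is made up for this example, and the addresses you get back depend on your own network.

```python
import socket


def lookup(hostname):
    """Ask the system's DNS resolver for the IP addresses behind a name.

    Returns a sorted list of address strings, or an empty list when the
    name cannot be resolved (the "not in the phonebook" case).
    """
    try:
        results = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        # The resolver could not find an entry for this name.
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # the address string is the first item of sockaddr.
    return sorted({info[4][0] for info in results})
```

On a normal day, `lookup("facebook.com")` returns one or more numeric addresses. During an outage like this one, a lookup of that kind would most likely have come back empty, which is why apps and browsers behaved as if the sites did not exist.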
These DNS errors can be tricky to fix. And in this case, it was almost as if Facebook hadn’t just vanished but had never existed. This also crippled Instagram and WhatsApp, as they both rely on Facebook’s ecosystem. However, the origins actually trace back to the Border Gateway Protocol, or BGP. If DNS is the phonebook, BGP is the network line that connects you. When a user sends data across the internet, BGP determines the best available paths that data can travel.
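Here is a toy sketch of that path-picking idea, again in Python. It is heavily simplified: real BGP weighs many more factors than hop count, and the address block and network names below are placeholders reserved for documentation, not Facebook's actual routes.

```python
def best_path(routes, prefix):
    """Pick the advertised path with the fewest hops for a prefix.

    Returns None when no path is advertised at all, meaning the
    destination is unreachable.
    """
    candidates = routes.get(prefix, [])
    if not candidates:
        return None
    return min(candidates, key=len)


# Hypothetical routing table: address block -> advertised paths,
# each path being the chain of networks the data would cross.
routes = {
    "198.51.100.0/24": [
        ["AS64500", "AS64501", "AS64502"],
        ["AS64503", "AS64502"],
    ],
}

before = best_path(routes, "198.51.100.0/24")  # the two-hop path wins

# Simulate the routes being withdrawn: nothing is advertised any more.
routes["198.51.100.0/24"] = []
after = best_path(routes, "198.51.100.0/24")   # None: no way to get there
```

Once every advertised path is gone, there is nothing left to choose from, so traffic has no route to its destination, however healthy the servers at the other end may be.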
Cloudflare Inc.’s chief technology officer, John Graham-Cumming, stated in a tweet: ‘Minutes before Facebook’s platforms stopped loading, public records show that a large number of changes were made to Facebook’s BGP routes.’ Finally, once systems were back online, Zuck wrote, “Sorry for the disruption today — I know how much you rely on our services to stay connected with the people you care about.” He also assured users that no data was breached or leaked.
All images: Courtesy Unsplash