Facebook faulty configuration change: Who, what, why?

Update: Facebook has since posted a second update with a bit more detail.

After around 6 hours of downtime, which started just before noon on Monday, October 4, 2021, and several hours after service was restored, Facebook finally posted an official explanation for the outage on their engineering blog. The cited reason for the blackout was a “faulty configuration change.”

Although less than 24 hours have passed, this is still troubling: at the same stage after other major outages, such as Cloudflare’s own outages and the major DDoS attack on Dyn, we have seen much more transparent explanations published.

For example, have a look at Cloudflare’s detailed explanation of Facebook’s outage from their vantage point connecting to Facebook. In the article, Cloudflare reported that “At 15:58 UTC, we noticed that Facebook had stopped announcing the routes to their DNS prefixes.”

Who, what, why?

Facebook said overnight that their “engineering teams have learned that configuration changes on the backbone routers that coordinate network traffic between our data centers caused issues that interrupted this communication” and that “the root cause of this outage was a faulty configuration change.” They also added that they “have no evidence that user data was compromised as a result of this downtime.”

Coming from a company already plagued by transparency complaints, this blog post leaves us with little more information than we already had. We still ask Facebook: who, what, and most importantly, why did this outage happen?

Facebook and transparency in the same sentence

The list of criticisms of Facebook is long: their collection and handling of data, user privacy, transparency, and a slew of other complaints. Coincidentally or not, on October 3, 2021, just the night before this outage, Frances Haugen appeared on 60 Minutes with claims that “Facebook chooses Profits Over Safety.” Haugen has filed a whistleblower complaint with the Securities and Exchange Commission (SEC), alleging that Facebook has misled investors “to prioritize growth over safety” by ignoring research showing that it amplifies “angry, polarizing, [and] divisive content” to the purported detriment of public safety, for profit.

Also notable is that Haugen revealed how she secretly copied tens of thousands of pages of Facebook’s internal research as evidence that, by her account, the company is lying to the public about making meaningful progress against hate, violence, and misinformation.

With Facebook’s reputation in mind, and given Haugen’s recent release of internal research documents, one has to ask: Is it possible for Facebook to be fully transparent about the cause of yesterday’s outage?

Well, it’s undoubtedly an excellent opportunity for them to do so! Especially if there’s nothing dubious about yesterday’s outage.

Conclusion

At present, we know little more than we did the day before. One could argue that these things take time to investigate, and that Facebook needs more time to make sure that if and when they provide a more detailed report, they get it right the first time.

However, compare this with Cloudflare’s first blog post after a similar outage in 2019, where they said: “This is a short placeholder blog and will be replaced with a full post-mortem and disclosure of what happened today.” By contrast, Facebook gave no indication that they understand their responsibility to publicly disclose as much as possible about this outage. Instead, they end the blog post with: “…we’re working to understand more about what happened today so we can continue to make our infrastructure more resilient.”

Let’s hope that, as they work to understand more about what happened, they will share what they learn. This outage affected billions of users who have uploaded their photos, conversations, and other personal data to Facebook’s family of platforms. And so we wait for the answers: who exactly was responsible, what happened, and why did it happen?
