Fastly CDN Outage Caused By Customer Configuration Change and Software Bug

Popular websites and apps around the world were knocked offline by a widespread outage at the Fastly content delivery network (CDN). Users were unable to access the sites and apps or saw an “Error 503” message.

Websites operated by news outlets including the Financial Times, CNN, the Daily Mail, Al Jazeera, CBS, The New York Times, The Guardian, and Bloomberg News were caught up in the blackout.

UK government websites and the US White House page were also inaccessible during the outage.

Social media and streaming platforms such as Reddit, Twitch, Pinterest, HBO Max, Hulu, and Spotify, as well as other services like PayPal and Amazon, were also affected.

Fastly confirmed that it had issues with its global CDN just before 1000 GMT and said it was working to fix the problem.

An hour later, Fastly said it had fixed the problem, and some websites began working again.

“The issue has been identified and a fix has been applied. Customers may experience increased origin load as global services return,” Fastly said on its Twitter account.

Fastly did not initially specify what caused the outage, which bore similarities to the recent Amazon Web Services (AWS) blackout.

Fastly later explained that the outage was triggered by a customer's configuration change that exposed a software bug, rather than by an infrastructure failure as previously suspected.

“We experienced a global outage due to an undiscovered software bug that surfaced on June 8 when it was triggered by a valid customer configuration change,” said Nick Rockwell, Fastly’s senior vice president of engineering and infrastructure.

The configuration change triggered the bug, which caused 85% of Fastly's network to return errors.

Fastly will be reviewing its infrastructure to prevent a similar incident from happening again.

“We have been — and will continue to — innovate and invest in fundamental changes to the safety of our underlying platforms. Broadly, this means fully leveraging the isolation capabilities of WebAssembly and Compute@Edge to build greater resiliency from the ground up. We’ll continue to update our community as we make progress toward this goal,” said Rockwell.

© Fourth Estate® — All Rights Reserved.
This material may not be published, broadcast, rewritten or redistributed.