Update: For Amazon’s explanation of what happened, go here.
On Tuesday, users of Amazon Web Services started reporting issues with the cloud platform, which the company acknowledged at around 11:20am PST.
The problems originated in the US-East-1 region, hosted in data centers in Northern Virginia. Thirty-three of AWS’s own services were affected, including nine that suffered complete disruption: Athena, EMR, Inspector, Kinesis Firehose, Simple Email Service, S3, WorkMail, Auto Scaling and CloudFormation. The disruption set off a chain reaction, taking countless cloud-based applications and websites offline.
The company initially said: “We’ve identified the issue as high error rates with S3 in US-EAST-1, which is also impacting applications and services dependent on S3. We are actively working on remediating the issue.”
Amazon reported that it had fixed the initial outage by 1:49pm PST, but performance issues for some services persisted until 6pm.
Outage outrage
According to AWS, the outage was caused by S3, the Simple Storage Service. It is one of the oldest and most popular components of the AWS platform and serves as the foundation for thousands of cloud-based products.
S3 relies on the same infrastructure used by Amazon to host its global network of websites, and enables customers to store and retrieve any amount of data at any time (unplanned outages excluded).
The actual cause of the downtime is not yet known, and DCD will update this article as we hear more. In the meantime, Twitter is filled with users and organizations that were affected by the sudden loss of service, including Expedia, Coursera, Quora, Trello, Slack and Business Insider.
During the outage, some users noticed that the AWS dashboard didn’t seem to reflect the ongoing issues. It turns out that the dashboard, like much of the Internet, relies on S3 and could not function as intended.
Even DownDetector, a popular website used to detect outages, went down with the S3 service.
We had previously embedded some tweets in this story, but it appears that the embedding mechanism broke when AWS started experiencing issues.
Here’s a selection of non-embedded tweets:
AWS S3 buckets are down. That’s like half the internet. #Amazon — embarrassed American (@kareemery) February 28, 2017
As is tradition, the AWS outage appeared in my Twitter feed before it appeared on the AWS status page. — Jamie Gaskins (@jamie_gaskins) February 28, 2017
We are currently experiencing a downtime because of Amazon #AWS #S3 infrastructure issues. We’ll keep you updated and will be back up soon. — SignEasy (@getsigneasy) February 28, 2017
We are experiencing issues with one of our infrastructure providers: AWS S3 is offline. Uploads and analyses are temporarily unavailable. — One Codex (@OneCodex) February 28, 2017
[update] AdStage customers: the platform is down due to an Amazon AWS server outage. We’re praying to the Amazon gods for a fast recovery 🙏 — AdStage (@adstage) February 28, 2017