Many of our clients have migrated to the cloud, and two of their main reasons for doing so are reliability and security. But nothing is perfect, and you still need a solid recovery strategy. The recent Amazon outage reinforces that need.
Amazon Web Services is by far the world’s largest provider of internet-based cloud-computing services. This week it experienced a widespread outage in its eastern U.S. region, causing major problems for thousands of websites and apps.
Beginning around midday on the East Coast, one region of Amazon’s “S3” service, based in Virginia, began to experience what the company, on its service site, called “increased error rates.”
While few services went down completely, thousands, if not tens of thousands, of companies had trouble with features ranging from file sharing to webfeeds to loading any type of data from Amazon’s Simple Storage Service, known as S3.
Amazon S3 stores files and data for companies on remote servers. Amazon started offering it in 2006, and it’s used for everything from building websites and apps to storing images, customer data and commercial transactions.
By late afternoon, Amazon had issued a statement saying it was still experiencing “high error rates” that were “impacting various AWS services.” “We are working hard at repairing S3, believe we understand root cause, and are working on implementing what we believe will remediate the issue,” the company said.
The problem affected both “front-end” operations — meaning the websites and apps that users see — and back-end data processing that takes place out of sight. Some smaller online services, such as Trello, Scribd and IFTTT, appeared to be down for a while, although all have since recovered.
The corporate messaging service Slack, by contrast, stayed up, although it reported “degraded service” for some features. Users reported that file sharing in particular appeared to freeze up.
The Associated Press’ own photos, webfeeds and other online services were also affected.
Major cloud-computing outages happen periodically. In 2015, Amazon’s DynamoDB service, a cloud-based database, had problems that affected companies like Netflix and Medium. But usually providers have workarounds that can get things working again quickly.
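On the customer side, one small piece of a recovery strategy is being able to read the same file from more than one place. The Python sketch below illustrates that idea only: it tries a list of storage sources in order and returns the first copy it can get. Every name in it (`read_with_fallback`, `primary_region`, `replica_region`, the sample file) is hypothetical and made up for illustration. The fetch functions are stand-ins, not real AWS calls, and the sketch assumes you already keep a copy of your data in a second location.

```python
from typing import Callable, List, Sequence, Tuple


class AllSourcesFailed(Exception):
    """Raised when none of the configured sources could return the object."""


def read_with_fallback(
    key: str,
    sources: Sequence[Tuple[str, Callable[[str], bytes]]],
) -> Tuple[str, bytes]:
    """Try each (name, fetch) pair in order; return the first successful read."""
    errors: List[str] = []
    for name, fetch in sources:
        try:
            return name, fetch(key)
        except Exception as exc:  # broad on purpose: any failure means "try the next source"
            errors.append(f"{name}: {exc}")
    raise AllSourcesFailed("; ".join(errors))


# Hypothetical stand-ins for real storage clients.
def primary_region(key: str) -> bytes:
    # Simulates the primary storage region being unavailable.
    raise ConnectionError("increased error rates")


REPLICA_COPY = {"logo.png": b"fake image bytes"}


def replica_region(key: str) -> bytes:
    # Simulates a healthy copy kept somewhere else.
    return REPLICA_COPY[key]


source, data = read_with_fallback(
    "logo.png",
    [("primary", primary_region), ("replica", replica_region)],
)
print(f"served {len(data)} bytes from {source}")
```

In a real setup, each fetch function would wrap whatever storage client you actually use, and the hard part is keeping the second copy current, which this sketch does not address.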
What is your cloud recovery strategy?