Amazon services 'recovering' as Snapchat and banks among sites hit by outage

On Monday, a massive outage at Amazon Web Services (AWS), one of the world’s leading cloud computing providers, caused widespread disruption across the internet, knocking many of the largest websites and apps offline. The incident affected over 1,000 apps and sites, ranging from social media platforms like Snapchat and Reddit to major banks such as Lloyds and Halifax, as well as popular online games and educational tools like Fortnite and Duolingo. According to Downdetector, a platform that monitors outages, user reports of problems skyrocketed to over 6.5 million globally during the morning of the outage, underscoring the immense scale of the disruption.

The problems appeared to begin around 7:00 AM BST, when users worldwide started having difficulty accessing a variety of services. Downdetector reported more than four million user complaints within just a few hours, covering around 500 different sites. That was more than double the volume of reports the platform sees on a typical weekday. The number of reports continued to climb throughout the day, peaking at over six million as more services struggled to recover from the outage.

Amazon Web Services later confirmed that it had addressed the underlying issue, but noted that some services were still experiencing residual problems. By approximately 10:00 PM BST, the company stated that many of the affected services had been restored and that progress was continuing, even as certain disruptions persisted. However, Amazon has not yet provided a detailed explanation of what caused the outage, nor issued a comprehensive official statement.

The company’s brief update on its service status page suggested that the root cause was related to “DNS resolution of the DynamoDB API endpoint” in the US-EAST-1 region. DNS, or Domain Name System, functions as the internet’s phone book, translating human-friendly website names into numerical IP addresses that computers use to locate and communicate with each other. When DNS services fail or are disrupted, web browsers and applications may become unable to find the content or services they need, leading to widespread outages like the one experienced on Monday.
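For readers who want to see what that lookup step looks like in practice, here is a minimal Python sketch using only the standard library. It is an illustration of a DNS lookup and its failure mode in general, not a reconstruction of what happened inside AWS. The `resolve` helper, the `broken_lookup` stand-in, and the hostname `service.example` are all hypothetical names introduced for this example.

```python
import socket


def resolve(hostname, lookup=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses for hostname.

    `lookup` defaults to the standard library resolver call. It is a
    parameter only so the failure path can be shown without a real
    outage (an assumption of this sketch).
    """
    try:
        results = lookup(hostname, None)
    except socket.gaierror:
        # Name resolution failed: the caller has no address to connect to.
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return sorted({entry[4][0] for entry in results})


def broken_lookup(hostname, port):
    """Hypothetical stand-in for a resolver that cannot answer."""
    raise socket.gaierror(socket.EAI_NONAME, "name or service not known")


if __name__ == "__main__":
    # A working lookup returns one or more addresses.
    print(resolve("localhost"))
    # A failed lookup returns nothing, even if the server itself is healthy.
    print(resolve("service.example", lookup=broken_lookup))
```

The second call shows the point made above: when the name cannot be translated into an address, an application has nowhere to send its request, regardless of whether the service behind that name is still running.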

Experts have highlighted the incident as a stark reminder of how intertwined and dependent much of the internet infrastructure has become on a few dominant cloud providers. Professor Alan Woodward from the University of Surrey emphasized that the outage revealed the deep interdependencies within modern digital infrastructure. “So many online services rely upon third parties for their physical infrastructure, and this shows that problems can occur even in the largest of those third-party providers,” he explained. He noted that small errors, often human in origin, can cascade into widespread and significant impacts affecting millions.

Mike Chapple, an information technology professor at the University of Notre Dame, likened the AWS outage to a large-scale power failure. He explained that during such events, crews working to restore services may inadvertently cause “cascading failures” or multiple flickers of connectivity as they address symptoms without fully resolving the root cause. This analogy helps illustrate why, even after initial fixes, some services continued to experience instability throughout the day.

The outage’s widespread impact has reignited debate about the risks associated with centralizing so much of the internet’s infrastructure within a handful of major cloud service providers. Matthew Prince, CEO of Cloudflare, a company specializing in internet security and infrastructure, told the BBC that while cloud computing offers tremendous benefits—including scalability and efficiency—these benefits come with vulnerabilities. “Everyone has a bad day; today Amazon had a bad day,” Prince remarked, highlighting that a single failure at a dominant provider can ripple across countless services that people and businesses rely on daily.

Cori Crider, head of the Future of Technology Institute, described the outage as akin to a “bridge collapsing,” pointing to the crucial role cloud providers play in the economy. She noted that Amazon, Microsoft, and Google together account for approximately 70% of cloud computing, a concentration that she characterized as “unsustainable.” Crider argued that relying heavily on a few dominant providers poses risks not only to economic stability but also to security and national sovereignty. She advocated for diversifying cloud infrastructure by encouraging the use of more local services and implementing structural separations to bolster market resilience against such shocks.

While much of the focus has been on AWS, some experts suggest that companies themselves share responsibility for the impact of outages. Ken Birman, a computer science professor at Cornell University, pointed out that many businesses using AWS have not sufficiently invested in building robust protection and backup systems for their cloud-based applications. He emphasized that outages are not uncommon
