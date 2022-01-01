



Cloud service outages are nothing new. However, the shift to remote work in 2020 exposed many vulnerabilities, as carriers, cable and fiber companies, and every popular app under the sun experienced temporary, catastrophic collapses. The shift put an unprecedented burden on the cloud infrastructure systems that support your favorite streaming and productivity sites, and these outages were the unavoidable result.

You would have hoped that 2021 would show a significant improvement. Instead, the internet proved to be a house of cards, ready to collapse if the wrong foundational piece folds. Many sites, whether out of frugality or poor planning, put all their data and traffic eggs in one cloud basket. Unless these sites plan better for emergencies, a failure at a single node can take down some of the highest-traffic sites.

This year, we saw our favorite messaging apps, smart home devices, gaming networks, productivity suites, and social media sites go down at one point or another. Beyond that, the Amazon Web Services (AWS) and Facebook outages proved how much our daily lives depend on the cloud, from smart home technology to package delivery.

Looking back at the worst outages of 2021, we can hope that things improve in 2022. But unless cloud infrastructure companies and content delivery networks (CDNs) change the way they do things, and companies start adding offline capabilities to technologies that depend on the cloud, there’s no reason to think the situation will improve.

1. AWS outage halts deliveries, cameras, and cat feeders

The recent December AWS outage is still fresh in my mind. Amazon Web Services reportedly runs about 33% of cloud infrastructure services, so roughly a third of cloud services may have been affected when AWS collapsed on December 7.

According to the AWS team, the internal network that AWS uses for monitoring, internal DNS, and authentication services saw a massive surge in connection activity that overwhelmed the network devices between the internal network and the main AWS network, delaying communication between the two. Because this internal network is linked to AWS servers around the globe, and because it took developers about seven hours to fix it, traffic was delayed or sites were shut down entirely around the world.

During the holiday shopping season, the Amazon delivery driver app, which provides routes and addresses, went down, so drivers couldn’t complete deliveries. Consumers were also unable to place new Amazon orders. In short, the company missed out on almost a day’s revenue. Amazon’s first-party services, including Alexa, Ring cameras, Prime Video, and Amazon Music, all went down, which meant smart video doorbells and baby monitors were temporarily worthless. Popular third-party apps such as Disney+, Venmo, and iRobot also broke down thanks to their choice of cloud provider.

According to CNBC, some exam services depended on the cloud, so the AWS outage interrupted universities’ final exams. Even some “smart” automated cat feeders stopped feeding cats that day.

Following this outage, Android Central readers said they were more wary of cloud-dependent smart home technology. Experts believe that Amazon needs to build offline controls into its smart home technology, but that’s unlikely, because the cloud is what lets companies sell cheap, low-powered devices that wouldn’t be possible without it.
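To make the idea of offline controls concrete, here is a minimal sketch of a feeder that prefers its cloud schedule but falls back to the last copy stored on the device. Everything in it is an assumption for illustration: the `Feeder` class, the `fetch_schedule` callable, and the `CloudUnavailable` error are hypothetical and do not describe any real product's firmware or API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


class CloudUnavailable(Exception):
    """Raised by the (hypothetical) cloud client when the service is unreachable."""


@dataclass
class Feeder:
    # Stand-in for a real cloud call; hypothetical, returns feeding hours (0-23).
    fetch_schedule: Callable[[], List[int]]
    # Last schedule that was fetched successfully, kept on the device itself.
    cached_schedule: List[int] = field(default_factory=lambda: [7, 18])

    def current_schedule(self) -> List[int]:
        """Prefer the cloud schedule, but fall back to the local copy on failure."""
        try:
            self.cached_schedule = self.fetch_schedule()
        except CloudUnavailable:
            pass  # cloud is down: keep using the last known schedule
        return self.cached_schedule

    def should_feed(self, hour: int) -> bool:
        return hour in self.current_schedule()


def cloud_down() -> List[int]:
    # Simulates an outage like the ones described above.
    raise CloudUnavailable("simulated outage")


feeder = Feeder(fetch_schedule=cloud_down)
print(feeder.should_feed(7))  # True: the cat still gets fed from the cached schedule
```

The design choice is simply that the device treats the cloud as a source of updates rather than as a requirement for doing its basic job.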

2. The metaverse collapses

If you’re talking about the most disruptive outage of 2021, you have to mention Facebook. Shortly before its name change to Meta, Facebook accidentally shut down its own cloud services because of “a change in the configuration of the backbone router that regulates network traffic between data centers.” This set off a cascade that brought all of its online services to a halt and made it impossible for anyone, including employees, to access Meta’s services around the world.

Meta’s cloud servers only power its own businesses, such as Facebook, Instagram, and WhatsApp, but the outage still spilled over and hurt other companies. Sites that rely on Facebook logins became inaccessible to users, and other shopping sites and games that rely on Meta’s servers and tokens also went down.

And, of course, the Facebook outage hobbled cloud-dependent hardware. Quest 2 owners lost access to their game libraries because of the Facebook account requirement, while Ray-Ban Stories smart glasses lost their smarts. Facebook commented that it would need to add offline support for the technology in the future.

Above all, the six-hour WhatsApp outage was the company’s worst blunder. For the millions of people who use the app as their primary way to communicate with their families, going without it was too much. After the outage, Telegram reportedly gained 70 million new users. That doesn’t necessarily mean WhatsApp lost that many users, but it definitely saw a serious exodus that it may never win back.

WhatsApp, Facebook, and Instagram had a similar outage in April 2021, but it lasted only 45 minutes.

3. Fastly stops the internet

When something works well, you don’t pay attention to it. As a result, many people hadn’t heard of the Fastly content delivery network (CDN) until it broke in June and dragged down some of the most popular websites.

CDNs cache content to take load off hosting servers and reduce bandwidth demands. As a result, many companies rely on CDNs to deliver data quickly by distributing it to different parts of the world, which reduces load times regardless of where users live.
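The caching idea described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not how Fastly or any real CDN is implemented: the `EdgeCache` class, its `ttl_seconds` parameter, and the `origin` function are all hypothetical names invented for this example.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy edge node: serves cached copies and only asks the origin when needed."""

    def __init__(self, fetch_from_origin: Callable[[str], str],
                 ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}  # path -> (stored_at, body)
        self.origin_hits = 0

    def get(self, path: str) -> str:
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # fresh copy: the origin server is not contacted
        # Missing or expired: go back to the origin and refresh the cached copy.
        self.origin_hits += 1
        body = self._fetch(path)
        self._store[path] = (now, body)
        return body


def origin(path: str) -> str:
    # Hypothetical origin server; a real one would be a slow network call.
    return f"content of {path}"


cache = EdgeCache(origin, ttl_seconds=60.0)
for _ in range(1000):
    cache.get("/index.html")
print(cache.origin_hits)  # 1: a thousand requests, one trip to the origin
```

The same property that makes this attractive also explains the risk: when every request flows through the edge layer, a fault in that layer is visible to every user at once.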

But in Fastly’s case, a misconfigured service caused disruption across its points of presence (POPs) around the world, damaging sites that depended on its edge computing. Specifically, sites such as Amazon, Twitter, Reddit, Google, CNN, The Guardian, and The New York Times all went down at once in early June. Within 49 minutes, Fastly had restored “95%” of the service, making this a widespread but relatively short outage compared with the others.

4. PSN outages trouble PS4 and PS5 players

If you managed to buy a PS5 this year, you may have had problems accessing your library or playing multiplayer games at some point in 2021. Sony and the CDN Akamai Technologies dealt with several outages throughout the year.

The worst and longest PSN outage ran from late February to early March, when it was confirmed that some PS5 and PS4 players sporadically lost access to their game libraries for several days.

Still, three more outages in the months that followed showed that Sony had a fundamental network problem to solve. In each case, players around the world received a maintenance error message when accessing online services, and the outages lasted one to five hours.

Many of the best PS5 games either require a constant online connection or revolve around multiplayer. If Sony can’t keep PSN running reliably in 2022, it is bound to make its loyal fans unhappy.

5. Google can’t help smart home customers

The first major outage of 2021 came in February, thanks to a sudden bout of memory loss for Google Assistant. Anyone who tried to ask a Nest or Google Home speaker a question was told “The device has not been set up yet,” despite evidence to the contrary. As a result, users couldn’t connect to any of the Google Home devices tied to their accounts, from smart lights to Nest security technology. Google Assistant’s Android app also had trouble answering questions.

This seemed to affect all Google Home users that night, and they turned to Reddit and the support forums for help. Google fixed the issue that night, hours after it became widely known, though it’s not clear exactly when it started.

6. Wink smart home winks out

Most of the worst outages of 2021 affected a wide range of sites for a relatively short time. However, the award for the truly worst outage of the year goes to the Wink Hub, which was down for 10 days. Because of their new reliance on a working cloud service, these hubs couldn’t control Zigbee or Z-Wave products and were of little value.

Wink offered a 25% discount on subscription costs as an apology, but never really explained the cause of the problem, saying only that it was optimizing because the Wink backend and API had become backed up. Many customers saw this outage as a sign that it was time to abandon Wink for good.

7. Android exposure notification system goes down

When it comes to contact tracing and preventing COVID-19 exposure, any delay in learning your status can lead to further spread and illness. So when the NHS COVID-19 app malfunctioned because of a problem with the Android exposure notification system on Google’s backend, it didn’t look good for Google.

Those who wanted to check their status found an indefinite “loading” screen. About 12 hours after the bug was reported, Google announced that it was investigating the issue, and it took another five to six hours to resolve it. Add in the spooky “phantom notification” glitch from 2020, in which a false notification that a user had been exposed to COVID-19 popped up and disappeared before it could be tapped, and it’s easy to see why many people didn’t trust the app by that point.

8. AWS outage redundancy

Following the major AWS outage on December 7, a second AWS outage occurred on December 15 due to issues at Amazon Web Services facilities in Oregon and Northern California. This time, it took down Twitch, DoorDash, Xbox Live, PSN, Ring, Disney+, and T-Mobile.

Then, on December 22, a third AWS outage took down Fortnite, Hulu, Quora, Slack, and Imgur. In this case, a power outage at a facility on the East Coast caused the problem. In other words, there were three outages in three weeks. The latter two lasted only about an hour, but that was certainly long enough to cause problems.

Will the outage problem diminish or expand in 2022?

These events highlight how vulnerable today’s cloud-dependent systems are. Because so much internet usage is concentrated in a handful of apps and services, most of which use a few major cloud infrastructure providers, a single crisis can reduce productivity or render expensive technology useless.

So can we expect fewer accidents next year?

To reduce outages, companies need to invest more in cloud infrastructure. While the recent infrastructure bill allocated billions of dollars to improve high-speed local broadband access and civilian cybersecurity, most of 2021’s worst outages were caused by company error, not hostile actors. As a result, we may need to expect (or pressure) companies to invest more in the cloud infrastructure itself.

As it stands, Gartner predicts that companies will spend $482 billion on cloud services in 2022, an increase of 21.7%. That should at least be a step in the right direction.
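For a sense of scale, the forecast can be turned back into an implied baseline with a little arithmetic. This is only a back-of-the-envelope sketch that assumes the 21.7% growth is measured against the previous year's total; the variable names are invented for the example, and the result is derived here rather than quoted from Gartner.

```python
# Figures from the forecast quoted above.
forecast_2022 = 482.0  # billions of US dollars
growth = 0.217         # 21.7% increase, assumed to be year over year

# If spend_2022 = spend_2021 * (1 + growth), then:
implied_2021 = forecast_2022 / (1 + growth)
increase = forecast_2022 - implied_2021

print(f"implied previous-year spend: about ${implied_2021:.0f} billion")
print(f"implied increase: about ${increase:.0f} billion")
```

Under that assumption, the forecast implies a bit under $400 billion of spending the year before, so the growth amounts to tens of billions of dollars in additional cloud spending.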

It’s important to note that many of the worst outages stemmed from a company’s internal monitoring network or a third-party CDN, not its main servers. The very systems meant to monitor and prevent outages can bring everything down under the wrong circumstances, and human error can have disproportionate consequences. A CDN is essential for delivering traffic as fast as possible, but it adds another step where something can go wrong.

If a single node, server, or data center can take down your system, it doesn’t matter how much you invest. To reduce major outages in 2022, companies need to structure their data better so that a backup can kick in quickly until the problem node is fixed. Things are much better than they were two years ago, but there’s a long way to go before outages stop being a problem.
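The backup idea above can be sketched as a simple failover loop that tries each backend in order and moves on when one is unreachable. The names `fetch_with_failover`, `primary`, and `backup` are hypothetical, and real deployments usually handle this at the DNS or load-balancer level rather than in application code; treat this as an illustration of the principle only.

```python
from typing import Callable, List, Sequence


class AllBackendsFailed(Exception):
    """Raised when every configured backend is unreachable."""


def fetch_with_failover(backends: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each backend in order and return the first successful response."""
    errors: List[Exception] = []
    for backend in backends:
        try:
            return backend(request)
        except ConnectionError as exc:
            errors.append(exc)  # remember the failure and try the next backend
    raise AllBackendsFailed(f"{len(errors)} backend(s) failed")


def primary(request: str) -> str:
    # Simulates the problem node: a region or data center that is down.
    raise ConnectionError("primary region is down")


def backup(request: str) -> str:
    # Simulates a healthy replica hosted somewhere else.
    return f"backup served {request}"


print(fetch_with_failover([primary, backup], "/home"))  # backup served /home
```

The sketch only helps if the backup does not share the failing component, which is the point of not putting every egg in one cloud basket.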

