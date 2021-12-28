



IT is synonymous with running a business for almost any company of any size. Therefore, if technology goes down, the company can go down with it.

IT failures, whether of complex systems or of projects, increasingly land at the top of the business news, and their impact can be all the more damaging and embarrassing.

We’ve gathered eight of 2021’s biggest technology crises to spotlight the near-catastrophic IT problems that can not only occur but also have a significant impact on a business. Beyond the schadenfreude, we hope these IT disaster stories offer some lessons, even if your organization isn’t as big, or your stakes as high, as those of the protagonists in these stories.

Why you need to design a better UI (and not anger your creditors)

Many companies take an “if it isn’t broken, don’t fix it” attitude toward IT tools, and if you’ve ever been through a failed upgrade or rollout, you know why. As a result, though, systems used in production can become truly obsolete, with UIs dating back to the early days of the software industry. That can mean usability problems with real consequences.

One of Citibank’s back-end systems is a good example of this trend, and it was one of the main causes of a $500 million mess. The story goes like this: Citibank was trying to send $7.8 million in interest payments to several creditors on behalf of one of its clients, Revlon. Doing that with Flexcube, an older piece of Citibank’s internal software, was a particularly cumbersome process. Citibank employees had to set up the transaction as if they were paying off the entire loan, then check multiple boxes to route most of the payment to an internal Citibank account so that only the interest portion went out to the creditors. Even though three different people approved this transaction for Revlon, it went through without all the appropriate boxes checked, and $900 million (most of which the creditors were not due until 2023) was sent out.

Surprising as it may be, mistakes like this are not unheard of, and the party that benefits usually returns the money sent in error. This time was different. More than half of the money went to various hedge funds that were unhappy that the terms of the loan had previously been renegotiated in Revlon’s favor. They said they regarded the money as an early repayment of the debt owed to them, and this year a judge ruled that they didn’t have to give it back.

The big lesson here is, at the very least, to modernize your UIs so that employees can do their jobs in a streamlined and consistent way. And try not to make people angry enough to take advantage of you when you do slip up.
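One way to make such a slip harder is to have the software, rather than a row of easily missed checkboxes, compare what the operator intends to pay with what will actually leave the bank. The Python sketch below is purely illustrative: the `Transfer` type, its fields, and `validate_release` are hypothetical names, not part of Flexcube or any Citibank system, and the figures in the example are the rounded amounts from the story.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Transfer:
    """Hypothetical payment instruction; not a real Flexcube structure."""
    total: Decimal                # full amount booked in the transaction
    to_internal_account: Decimal  # portion routed to an internal account
    intended_external: Decimal    # what the operator means to send out


def external_amount(t: Transfer) -> Decimal:
    """Amount that would actually leave the bank."""
    return t.total - t.to_internal_account


def validate_release(t: Transfer) -> None:
    """Refuse to release a transfer whose external amount differs from intent."""
    actual = external_amount(t)
    if actual != t.intended_external:
        raise ValueError(
            f"would send {actual} externally, "
            f"but operator intended {t.intended_external}"
        )


# Interest-only payment set up correctly: passes silently.
validate_release(
    Transfer(Decimal("900000000"), Decimal("892200000"), Decimal("7800000"))
)
```

With the internal routing left at zero, the same call raises `ValueError` instead of releasing the full amount. The design choice is that the operator states the intended outcome once and the system checks it, instead of relying on several approvers to notice an unchecked box.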

Sacré bleu! French bank customers see each other’s accounts

On February 23, customers of the French bank LCL noticed that they were looking at someone else’s information simply by logging in to their banking app. Word spread quickly on Twitter, and many speculated that it was the result of a cyberattack. According to the bank itself, however, it was actually the result of a software error that was fixed within a day.

Of course, a development error of this type is a sign of internal failure at the company where it occurred, and it should not happen, especially in the banking industry. The fallout showed the typical dance that follows this kind of mistake, with the company at fault minimizing the problem. LCL said that no personal information was disclosed, that customers could only view other customers’ accounts and not send money, and that hundreds of customers were affected. Others pointed out that transaction information could have been used to work out a customer’s identity, and that tens of thousands of users could have logged in while the bug was running in live code. In the end, LCL had to scramble to avoid heavy fines from European privacy regulators.

When software keeps the cell door locked

In 2019, the Arizona state legislature passed a law allowing certain inmates convicted of nonviolent crimes to earn an earlier release by completing programming in state prisons. However, a whistleblower revealed in February that, more than a year later, the software that tracks inmates’ release eligibility had still not been updated to comply with the new law. The state claims that eligible inmates’ sentences can be, and in fact are being, recalculated manually. The truth is that many inmates do not know they are eligible for release, and because they have no outside advocates to press their cases, they languish in prison when the law entitles them to be free.

There are several IT lessons here. The first is the importance of building flexibility and extensibility into any system. The second is that software is more than just software: it has real and serious effects on human lives. Finally, there is the question of how laws get implemented in code, and whether the algorithms that enforce a law should be developed during the legislative process rather than left as an afterthought once the law is already on the books.
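The first lesson, building in flexibility, can be sketched as keeping sentence-credit rules in data rather than hard-coding them, so that a change in the law becomes a new rule entry rather than a rewrite. Everything in this Python sketch is hypothetical: the `Inmate` and `CreditRule` types, the rule name, and the day counts are invented for illustration and do not reflect Arizona’s statute or its actual software.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Inmate:
    """Hypothetical record with only the fields this sketch needs."""
    nonviolent: bool
    programs_completed: int
    sentence_days: int


@dataclass(frozen=True)
class CreditRule:
    """One release-credit rule: who it applies to and how much it grants."""
    name: str
    applies: Callable[[Inmate], bool]
    credit_days: Callable[[Inmate], int]


def days_to_serve(inmate: Inmate, rules: Iterable[CreditRule]) -> int:
    """Sentence length minus all applicable credits, never below zero."""
    credit = sum(r.credit_days(inmate) for r in rules if r.applies(inmate))
    return max(inmate.sentence_days - credit, 0)


# A new law becomes a new entry in this list (the numbers are made up).
RULES = [
    CreditRule(
        name="program-completion credit",
        applies=lambda i: i.nonviolent and i.programs_completed > 0,
        credit_days=lambda i: 30 * i.programs_completed,
    ),
]
```

For example, `days_to_serve(Inmate(True, 2, 365), RULES)` applies the made-up 30-day credit twice, while an inmate who does not qualify keeps the full sentence. The point is only that eligibility logic lives in one replaceable table instead of being scattered through the code.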

Maine’s ancient personnel system is stalling

Maine’s personnel and payroll system, as the Portland Press Herald describes it, runs on a 40-year-old system programmed in a defunct language that only one state employee knows how to use. The system outlasted a 2016 attempt to replace it, which flopped. Another attempt, scheduled to finish in 2020, collapsed into mutual recrimination this March when Workday, hired to deploy a new cloud-based system for Maine, left the project.

Deployments of ERP systems and similar platforms are notoriously disaster-prone, and Maine’s payroll needs are very complex (for example, state police are paid different hourly rates if they are carrying weapons, working with a K9, or wearing scuba gear). The heart of the dispute will be familiar to anyone who has been involved in a project of this size. Maine says the system went online with an error rate of 50%, while Workday says the core data imported into the system was hopelessly riddled with errors. More fundamentally, Maine appears to have staffed the project with people who lacked the necessary skills, and the state was unwilling to pay enough to find workers who could do the job well. Throw in accusations of nepotism and sexual harassment that disrupted IT management, and the result is that Maine is still using its 40-year-old HR system.

Amazon’s leave problems

If your takeaway from the previous two items is that governments can’t manage projects competently, consider a private company as well: Amazon, the very model of the super-efficient new economy enabled by IT and the web.

According to a New York Times investigation, Amazon’s internal process for granting different types of leave to employees is badly broken. This has led to a string of horror stories affecting white-collar and blue-collar workers alike: employees fired for not showing up to work while on approved leave, new mothers on maternity leave who saw mysterious cuts to their pay, and an injured worker on disability whose checks stopped arriving and who had to sell a wedding ring for cash.

It turns out that Amazon manages leave with a patchwork of software products from different vendors, a legacy of its rapid early growth. The lesson here is that choices made early in a company’s history can reverberate years or even decades later. As with the Arizona prison system, Amazon is trying to make up for IT dysfunction with human labor: 67 full-time employees are dedicated to entering data about employee leave.

Eating too much of your own dog food

On October 4, people around the world lost access to Facebook, Instagram, and WhatsApp as every service run by the company now known as Meta was cut off from the internet. We won’t dig too deeply into the actual cause of the crisis, which involved a Border Gateway Protocol error that effectively cut Facebook’s services, including its DNS servers, off from the rest of the internet. Instead, we’d like to focus on one detail that may be relevant to your IT shop, even if it isn’t one of the world’s largest tech companies.

Early in the outage, New York Times tech reporter Sheera Frenkel reported that Facebook employees couldn’t get into headquarters because their ID badges wouldn’t open the doors. This kept technicians from physically reaching the servers they needed to fix the underlying problem. Facebook’s electronic door locks were, it seems, served by ... Facebook. The company appears to be quite committed to running all of its internal systems on its own infrastructure, which also meant that its in-house communication systems were down and couldn’t be used to respond to the crisis. The industry term for this practice is “eating your own dog food,” and it is generally considered a vote of confidence in your own products, but Facebook’s catastrophe shows the need for a backup food supply.

A hidden bug brings things down fast

On June 8, millions of internet users trying to reach sites ranging from Reddit to key UK government agencies were met with a 503 error code, indicating that the server hosting the website was unable to process the request. (Twitter was still working but, tragically, could no longer display emoji.) Why did so many different sites go offline at once? The answer turned out to be related to the rise of content delivery networks (CDNs), which deploy proxy servers at strategic points on the internet to give clients ultra-fast load times. Almost every big content site uses a CDN these days, and there aren’t many players in this space, so if one goes down, much of the internet can go with it.

In this case, the single point of failure was Fastly, an edge computing provider with a fast-growing CDN business. On May 12, Fastly released a software update containing a bug that could be triggered by a particular customer configuration under the right conditions. On June 8, a customer unknowingly pushed just such a configuration, creating a crisis at the intersection of software development and industry consolidation.

Shooting the messenger

In October, a St. Louis Post-Dispatch reporter, working with security expert Shaji Khan, found that teachers’ Social Security numbers were inadvertently exposed on a website where the general public could look up teachers’ credentials and certifications. The numbers didn’t actually appear on the search results page itself, but they sat in clear text in the page’s HTML and so were easy to find. The Post-Dispatch notified the state’s education department and gave it time to fix the flaw before publishing the story. If that had been the end of it, we probably wouldn’t be talking about this story now.
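A common defense against this kind of leak is to build the page from an explicit allow-list of fields, so that sensitive values never reach the HTML at all rather than merely being hidden from view. The Python sketch below is a generic illustration, not the state’s actual code; the record layout, `PUBLIC_FIELDS`, `public_view`, and `render_row` are all assumptions made for the example.

```python
from html import escape

# Hypothetical allow-list: only these keys may ever be sent to the browser.
PUBLIC_FIELDS = ("name", "certification", "status")


def public_view(record: dict) -> dict:
    """Copy only allow-listed fields; anything else (e.g. an SSN) is dropped."""
    return {key: record[key] for key in PUBLIC_FIELDS if key in record}


def render_row(record: dict) -> str:
    """Render one HTML table row from the public view of a record."""
    cells = "".join(
        f"<td>{escape(str(value))}</td>"
        for value in public_view(record).values()
    )
    return f"<tr>{cells}</tr>"
```

Because the filter runs on the server before any markup is produced, a field that is added to the database later stays private until someone deliberately adds it to the allow-list.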

But two days after an education department spokesperson began drafting a statement thanking the paper for alerting it to the problem (a statement that was never sent), the governor publicly accused the paper of employing a “hacker” to embarrass him and the state government, and launched a criminal investigation. After doubling down, he faced backlash and ridicule, including blowback from members of his own party, and so we are definitely talking about it now. The lesson here is that how you deal with the fallout from an IT disaster is just as important as the disaster itself.

