CrowdStrike CEO says resolving IT issues will “take some time”
7 hours ago
By Joe Tidy, BBC World Service Cyber Correspondent
Getty Images
A Windows error screen caused problems for the Mercedes team ahead of the practice session for the Hungarian Grand Prix.
The boss of cybersecurity firm CrowdStrike has acknowledged that it may take “some time” for all systems to be restored after an update by the company caused a global IT outage.
Experts have warned that it could be days before large businesses return to normal.
They said a software fix for the issue already exists, but the manual process required is labor-intensive.
The global IT outage has caused thousands of flights to be cancelled and has disrupted a range of services, including banking, healthcare and retail.
The issue arose after an update from CrowdStrike caused Microsoft systems to “blue screen” and crash.
The problematic software was automatically sent to the company's customers overnight, affecting many of them when they arrived at work on Friday morning.
In many cases, the affected computers cannot simply be restarted.
CrowdStrike CEO George Kurtz wrote on X: “The issue has been identified, isolated, and a fix has been distributed.”
In an interview with NBC's Today Show, Kurtz said the company was “deeply sorry for the impact this has had on our customers.”
“Many customers are restarting their systems and will have them back up and operational,” he said, but added that “for systems that don't recover automatically, it may take some time.”
The fix is not automatic and is referred to in the industry as a “fingers on the keyboard” solution.
“Affected systems will need to be booted into safe mode to remove the flawed update, as the system will no longer boot,” said researcher Kevin Beaumont.
“This is incredibly time consuming and would take an organization several days to implement at scale.”
Technical staff would then have to reboot all affected computers, which could be a huge task.
CrowdStrike is one of the largest and most trusted brands in cybersecurity.
The company has about 24,000 customers worldwide and potentially protects hundreds of thousands of computers.
In a message to customers on Friday, Kurtz said the outage was not the result of a security incident or cyber-attack, but was caused by a flaw in a “content update.”
“As we resolve this incident, we intend to provide full transparency about how this happened and what steps we are taking to ensure this never happens again,” Kurtz wrote.
The issue is described as a “content update,” suggesting that the overnight update was meant to be a smaller one rather than a major update to cybersecurity software.
It could have been something as innocuous as a change to a font or logo in the software's design.
This could explain why the software wasn't checked as rigorously as major updates, but it also raises the question of how a small update could cause so much damage.
One struggling IT manager said that once IT staff get to the computers, the process of getting them up and running again is quick, but the problem is getting them to the computers.
The person, who asked not to be named, said he manages 4,000 computers at an education company and that his team is working at full capacity.
“Using the command prompt as a workaround, we were able to repair all of our servers. But since many of our PCs are spread across five sites, it's not easy to do that manually. Any PCs that were left powered on overnight are affected and are being rebuilt,” he said.
IT experts say this manual process will be especially difficult for large organizations with thousands of computers that may have limited IT resources.
Small businesses that don’t have a dedicated IT team or that outsource their IT issues may also struggle.
Larger, better-resourced companies, such as American Airlines, appear to be resolving the problem faster.
Interestingly, many people in the US may be less affected: computers that were switched off overnight will download the corrected software rather than the faulty version when they are booted up, though some manual intervention is still likely to be required.
Beaumont said one of the world's “most impactful IT incidents” was “caused by a cybersecurity vendor.”
Ironically, customers were affected by this because they had followed the usual advice issued by cybersecurity experts and installed security updates as they received them.
Security companies have accidentally sent out harmful software updates in the past, but nothing on this scale, or causing this much damage, has been seen before.
While the incident caused widespread disruption, the WannaCry cyber attack in May 2017 may have been even more sinister.
This was a malicious cyber attack that affected older versions of Microsoft Windows and automatically spread to all computers with old and unprotected Windows software installed.
An estimated 300,000 computers in 150 countries were affected.
The virus battered the NHS for several days, affecting clinics and hospitals across the country.
In that case, it was an attack, believed to have come from North Korea, that got out of hand.
The NotPetya attack, which occurred a month later, was eerily similar in method and damage.
Friday's outage, by contrast, was a mistake, not an attack.
