Last Friday, the world experienced the largest-ever global outage of key infrastructure running on Windows PCs. The cause was a faulty update to CrowdStrike’s kernel-level Falcon Sensor software, which crashed modern Windows systems and led to flight delays around the world. Southwest Airlines reportedly avoided the issue because it was using Windows 3.1 instead of the latest version of the OS.
But as it turns out, trouble with the software isn’t limited to modern Windows operating systems: Linux users have been reporting kernel panics and crashes related to the same software since around April of this year, according to a report from The Register.
So is this a cross-platform issue? The particular bug that caused the chaos of the past few days probably is not. After all, if it were, Windows machines would have stopped working sooner. But what this does show is that CrowdStrike has been neglecting its Falcon Sensor security software for quite some time.
For those unfamiliar, the “kernel” of an operating system is the layer beneath the outer layer of user interaction (usually called the “shell”), and is the one most directly connected to the hardware underneath. In reality, very little software needs access to the kernel to do its work. Security software is a plausible exception, since threats often try to get into the kernel, but it is very important that such software does not itself cause instability or crashes in the target platform’s kernel.
An interesting aside pointed out by The Register is that CrowdStrike’s current CEO, George Kurtz, was the CTO of McAfee during the infamous 2010 update that sent many PCs into an infinite boot loop. This could make George Kurtz the first executive in history to be on watch for two major global PC outages caused by faulty security software updates.
Affected Linux users reportedly include those running Red Hat Enterprise Linux, Debian (on which the more popular Ubuntu is based), and Rocky Linux. The issues in question seem to involve the underlying Linux kernel, which is common across Linux distributions, and reportedly crash systems running kernel version 5.14.0-427.13.1 or newer.
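For administrators who want to check whether a machine is running a kernel at or above a reported version, a minimal Python sketch follows. The function names are hypothetical, and the threshold string is an assumption taken from the version quoted above that should be checked against the vendor's advisory. The comparison is also naive: it only looks at the leading numeric fields of the release string, and build numbers are not comparable across distributions.

```python
import platform
import re


def parse_kernel_release(release: str) -> tuple[int, ...]:
    """Return the leading numeric fields of a kernel release string.

    Example: '5.14.0-427.13.1.el9_4.x86_64' -> (5, 14, 0, 427, 13, 1)
    """
    # Match the leading run of numbers separated by '.' or '-',
    # stopping at the first non-numeric field (e.g. 'el9_4', 'amd64').
    match = re.match(r"\d+(?:[.-]\d+)*", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in re.split(r"[.-]", match.group(0)))


def is_at_or_after(release: str, threshold: str) -> bool:
    """Naive check: compare the numeric fields field by field."""
    return parse_kernel_release(release) >= parse_kernel_release(threshold)


if __name__ == "__main__":
    # Assumed threshold; verify against the vendor advisory before relying on it.
    reported = "5.14.0-427.13.1"
    running = platform.release()  # on Linux, the same string as `uname -r`
    print(running, "at or after", reported, "->", is_at_or_after(running, reported))
```

A result of true only means the running kernel falls in the reported version range; it says nothing about whether the Falcon sensor is installed or how it is configured.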
Linux users appear to have more recourse against such issues, including switching the sensor to its eBPF “user mode”, but the fact that CrowdStrike’s software has crippled both Linux and Windows systems speaks to the severity of the company’s kernel software development issues.
It also shows that there were warning signs ahead of this past global outage, and that CrowdStrike should long ago have put in place thorough testing of these enterprise- and government-targeted updates to prevent kernel-level crashes. After all, most affected users in these tightly controlled environments likely do not have the administrative access or knowledge necessary to fix issues after they occur. In other words, significantly improved QA testing appears essential to CrowdStrike’s long-term success.