The global outage caused by a CrowdStrike update last week highlights the ongoing tension between cybersecurity and operations teams.
These teams agree that system availability is a top priority. But availability can be adversely affected when patches or changes are made to harden the environment, and it can be affected just as badly when no changes are made at all.
On the one hand, there is the case of CrowdStrike, where an update without proper testing caused downtime, the impact of which continues to ripple around the world. And let’s not forget the United Airlines software update in 2023, which caused flight cancellations and delays similar to those the entire airline industry has experienced in recent days.
On the other hand, for Change Healthcare, the issue was that updates were not deployed. Change did not update critical systems to implement and enforce multi-factor authentication, a well-known best practice and, in some industries, a frequently enforced regulatory requirement. That failure left the systems vulnerable to threats.
These outages represent two distinct but related challenges that must be delicately balanced.
Is your organization making security changes as quickly as possible to avoid becoming the victim of a cyberattack, or is it postponing updates (sometimes indefinitely) until there is time to test how the patches will affect its systems?
Cybersecurity teams are focused on identifying vulnerabilities and weaknesses as quickly as possible and remediating them, especially if they pose a significant risk. These professionals fear a breach or successful attack, but typically do not have the authority to implement changes.
This is where operations teams come in: they often fear disruptions to operations caused by patches and other security changes more than they fear a successful intrusion.
As the CrowdStrike and Change Healthcare incidents demonstrate, both concerns are valid: Deploying patches without taking the time to properly test them can cause disruption and downtime, and waiting too long to deploy changes can allow a breach or attack to occur, causing disruption and downtime.
The National Institute of Standards and Technology (NIST) Special Publication 800-40 Rev. 4 describes this tension:
“What needs to change for many organizations is the perception that patching disruptions are a cost to the organization itself, while cybersecurity incident disruptions are a cost caused by a third party. While these may be true in isolation, they are misleading and incomplete as part of an organization’s risk response. Patching disruptions are largely controllable, while incident disruptions are largely uncontrollable. Patching disruptions are also a necessary part of maintaining nearly any type of technology to avoid major incident disruptions.”
But the question remains: how can organizations balance cybersecurity with operations?
The broadest answer is that organizations need to mature their vulnerability and patch management programs and supporting processes. That is easier said than done, but publications such as those from NIST are a good starting point. Maturing these programs has a lot to do with how the environment is implemented. For example, do organizations know what their assets are? Are those assets configured according to security best practices?
Again, achieving a secure environment requires a balance between protection and operation. When organizations make changes to their systems to make them more secure, they must test how those changes affect functionality.
This incremental process should document exceptions where a change is simply incompatible with system operations, along with the compensating measures put in place to maintain the balance. It is important to keep evaluating your vulnerability and patch management program even after it has reached maturity, and to keep testing so that your operations team can move to next steps with clarity and confidence.
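One way to picture the exception log described above is as a set of structured records that can be checked automatically. The Python sketch below is only an illustration under assumptions: the `PatchException` class, its field names, the review date, and the sample assets are all hypothetical and do not come from NIST guidance or any particular tool.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class PatchException:
    """Hypothetical record of a change that could not be applied to an asset."""
    asset: str
    change: str
    reason: str
    compensating_controls: List[str] = field(default_factory=list)
    review_by: date = date.max  # assumed field: when the exception should be re-evaluated

    def needs_attention(self, today: date) -> bool:
        """Flag exceptions with no compensating control or an overdue review."""
        return not self.compensating_controls or today > self.review_by


# Illustrative data only; these assets and changes are made up.
exceptions = [
    PatchException(
        asset="legacy-billing-01",
        change="enforce newer TLS version",
        reason="vendor client does not support it yet",
        compensating_controls=["isolated network segment"],
        review_by=date(2025, 1, 31),
    ),
    PatchException(
        asset="lab-instrument-07",
        change="OS security update",
        reason="update breaks instrument driver",
    ),
]

flagged = [e.asset for e in exceptions if e.needs_attention(date(2024, 8, 1))]
print(flagged)
```

Keeping exceptions in a form like this makes it harder for a postponed patch to become an indefinitely postponed one, since every exception either carries a compensating measure or shows up in the flagged list.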
Many automated tools, such as configuration scanners, can help detect issues that need to be fixed, and some can make deploying patches and changes easier, but there are still significant gaps in the area of test automation for operations teams.
Testing often takes up the time of operations teams and system administrators, but it doesn’t have to. The burden can be reduced by using a remote, secure test environment with a digital twin of your systems, creating customized AI- and ML-driven functional tests, running tests before and after environmental changes, reporting on any issues that arise with updates, and providing clear impact reports.
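To make the before-and-after testing idea concrete, here is a minimal Python sketch that runs the same functional checks against a copy of a system before and after a change, then summarizes the impact. It is an illustration under assumptions, not CyDeploy's implementation: the dictionary standing in for a system, the check names, and the `evaluate_change` and `impact_report` functions are all hypothetical.

```python
import copy
from typing import Callable, Dict, List

# A functional check takes the (hypothetical) system state and returns True on pass.
Check = Callable[[dict], bool]


def run_checks(system: dict, checks: Dict[str, Check]) -> Dict[str, bool]:
    """Run every named check against the system and record pass or fail."""
    return {name: bool(check(system)) for name, check in checks.items()}


def impact_report(before: Dict[str, bool], after: Dict[str, bool]) -> Dict[str, List[str]]:
    """Group checks by how their result changed across the update."""
    return {
        "regressed": sorted(n for n in before if before[n] and not after[n]),
        "fixed": sorted(n for n in before if not before[n] and after[n]),
        "unchanged": sorted(n for n in before if before[n] == after[n]),
    }


def evaluate_change(system: dict, checks: Dict[str, Check],
                    apply_change: Callable[[dict], None]) -> Dict[str, List[str]]:
    """Apply the change to a copy (a stand-in for a digital twin), never to the original."""
    twin = copy.deepcopy(system)
    before = run_checks(twin, checks)
    apply_change(twin)
    after = run_checks(twin, checks)
    return impact_report(before, after)


# Illustrative data only: a toy system and a change with a simulated side effect.
system = {"mfa_enabled": False, "login_service_up": True}
checks = {
    "login works": lambda s: s["login_service_up"],
    "mfa enforced": lambda s: s["mfa_enabled"],
}


def enable_mfa(s: dict) -> None:
    s["mfa_enabled"] = True
    s["login_service_up"] = False  # simulated breakage caused by the change


report = evaluate_change(system, checks, enable_mfa)
print(report)
```

In this toy run the security goal is met but a functional check regresses, which is exactly the kind of result an operations team would want to see in a test environment before the change reaches production.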
My company, CyDeploy, is trying to fill this gap and move toward a future where changes can be made quickly and reliably.
This is a guest post by Tina Williams-Koroma, founding CEO of Baltimore-based cybersecurity companies CyDeploy and TCecure.