
How NOT to f-up your security incident response

Feature Experiencing a ransomware infection or other security breach ranks among the worst days of anyone’s life — but it can still get worse.

Like if you completely and utterly stuff up the incident response investigation, and that snafu adds millions of dollars in extra damages to the overall bill.

In one such incident, Jake Williams, VP of research and development at cybersecurity consulting biz Hunter Strategy, says he was called in to clean up a client’s hot mess of a forensics report, prompting a frustrated social media post “imploring” companies: “This is NOT something you can just DIY.”

The mishaps made in this investigation “are easily a seven-figure mistake,” he added.

Williams, probably best known by his social media moniker MalwareJake, used to work as a US National Security Agency hacker and also serves as an IANS Research faculty member. 

The errors that occurred during the incident response and which were revealed in the subsequent forensic report stem from “a big issue of confirmation bias,” Williams told The Register. “The report reads like they formed a theory about what happened, and then spent a bunch of time going and searching for evidence that supported their conclusions.”

He declined to name the targeted organization, confirming only that it was a Fortune 1000 company, so “big enough that I would have expected a bit more rigor in the forensic analysis.” 

Both the CISO and CIO were fired over the security incident, during which digital intruders exploited a combination of SQL injection and directory traversal bugs to break in and compromise a number of servers. One of these was internet-facing.
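Both bug classes are old and well understood. The Python sketch below is purely illustrative, with nothing taken from the victim's actual codebase: it shows the vulnerable pattern and the standard fix for each.

```python
# Illustrative only: generic SQL injection and directory traversal
# patterns of the kind exploited in the breach described above.

import sqlite3
from pathlib import Path

def find_user_unsafe(conn, username):
    # SQL injection: attacker-controlled input is spliced straight into
    # the query, so a payload like "x' OR '1'='1" matches every row.
    return conn.execute(
        "SELECT id FROM users WHERE name = '%s'" % username
    ).fetchall()

def find_user_safe(conn, username):
    # Parameterised query: the driver treats the value as data, not SQL.
    return conn.execute(
        "SELECT id FROM users WHERE name = ?", (username,)
    ).fetchall()

def read_file_safe(base_dir, requested):
    # Directory traversal guard: resolve the full path and confirm it
    # stays inside the intended root before serving anything.
    base = Path(base_dir).resolve()
    target = (base / requested).resolve()
    if not target.is_relative_to(base):  # Python 3.9+
        raise PermissionError("path escapes the web root")
    return target.read_text()
```

The fixes are deliberately boring: parameterised queries and a path-containment check are the textbook answers to both bug classes.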

Confirmation bias

“Occam’s Razor says that [the internet-facing device] probably is patient zero, and they scoured the logs on this public-facing server until they found something that they thought was related to or evidence of compromise,” Williams said. 

“The unfortunate reality is it was not patient zero, even though it was internet-facing,” he continued. “It was one the threat actor laterally moved to after having been in the network for over a month.”

Williams says it took him a couple of hours of analysis to determine the internet-facing server wasn’t the initial access point. “Looking at their report, it was pretty obvious that they were trying to cherry-pick a piece of data and say that is evidence of exploitation.”

This particular incident highlights one of the most common mistakes that organizations make: not allocating enough time to an investigation, and failing to incorporate new evidence, according to Williams and other incident responders interviewed by The Register.


And, of course, every organization and executive reacts differently upon realizing that someone has broken into their systems. 

“Many customers behave similar to patients who have just received a sobering medical diagnosis,” Microsoft’s Director of Incident Response Ping Look told The Register.

Their reaction depends on “how prepared they are to receive the diagnosis — how much experience they may have and how much information they have available,” she added. “Many organizations understandably do not know what to do or where to start.”

The first challenge that most orgs face, according to Mandiant Consulting CTO Charles Carmakal, involves “not properly scoping out the investigation and being too narrowly focused.”

Stop, drop and scope

This narrow focus may be a result of an exec or insurance company trying to minimize investigation costs. Or it could be due to an incorrect theory — about how or where the intruders gained access to the network, for example. 

“Maybe you think the incident is limited to a particular system or environment,” Carmakal told The Register. “But it could be broader. And the risk of not properly scoping out an incident is not finding backdoors or credentials that were stolen by an adversary that could be used to re-compromise the environment.”

Immediately after a cyberattack, when every second counts and teams are scrambling to understand what happened while also getting vital systems back up and running, companies commonly rush to remediate, said James Perry, CrowdStrike VP of global digital forensics & incident response. 

When this happens, it’s easy to miss or fail to preserve key evidence. This is understandable. “Getting back to business operations is critical,” Perry told The Register. “But rebooting systems, wiping machines or making changes too quickly can erase crucial forensic data. Without it, determining the full scope of a breach becomes significantly harder, if not impossible.” 

Create a timeline

And don’t skip out on doing an incident report, either. Not having this detailed timeline laid out in front of you, in writing, makes it really hard to identify gaps in your understanding of the attack.

“As you start to write things down on paper, you create a timeline, you create an access propagation diagram that shows how the attacker went from system A to system B to system C, all the way down to system Z,” Carmakal said.
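That access propagation diagram can start as nothing fancier than sorted event records. The Python sketch below uses invented hosts, timestamps, and field names to show the idea: normalize events pulled from logs, sort them by time, and chain the hops.

```python
from datetime import datetime

# Hypothetical event records of the kind an IR team would normalise
# from logs; every host and action here is invented for illustration.
events = [
    {"time": "2024-03-02T11:05", "src": "web-01", "dst": "db-01",  "action": "SMB logon"},
    {"time": "2024-03-01T09:14", "src": "vpn",    "dst": "web-01", "action": "webshell upload"},
    {"time": "2024-03-04T22:40", "src": "db-01",  "dst": "dc-01",  "action": "credential dump"},
]

def build_timeline(events):
    # Sorting by timestamp is the whole trick: gaps and impossible
    # orderings jump out once everything sits on one written timeline.
    return sorted(events, key=lambda e: datetime.fromisoformat(e["time"]))

def propagation_path(timeline):
    # Chain the src -> dst hops into the "system A to system B to
    # system C" diagram described above.
    return [t["src"] for t in timeline] + [timeline[-1]["dst"]]
```

Run against the sample events, the path comes out as vpn to web-01 to db-01 to dc-01, and a hop that doesn't fit the timeline is exactly the kind of gap a written diagram exposes.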

Additionally, sometimes the people who are directly involved in the incident response aren’t the same ones doing the hands-on remediation activities. 

“Then you have a lost-in-translation situation, where, if the guidance was verbal and not written on paper, you’re very likely to miss some of the important nuances,” Carmakal added.

Ransomware has entered the building

Ransomware attacks present their own unique challenges and can be especially taxing for organizations struggling to recover their systems while also containing and mitigating the infection.

“The pressure is immediate. Systems are down, operations are disrupted, and leadership wants answers and to get back up and running quickly,” CrowdStrike’s Perry said, adding that many companies don’t have tested response plans or decision-making frameworks for the aftermath of these data-encrypting events.

“This often leads to rushed or poorly coordinated actions, such as attempting partial restorations without fully understanding the scope of the compromise, which can cause reinfection or data loss,” he explained.

Visibility can also be an issue, because most ransomware groups steal sensitive data before they lock it up, and then extort the organization to prevent it from being leaked.

“Many organizations lack the forensic capability to determine what data was exfiltrated, when, and by whom,” Perry said. “Log retention issues, incomplete network monitoring, and the disabling of security tools by attackers further complicate investigations.” 


“Without a solid grasp of the full attack chain, companies struggle to assess the true impact, notify affected stakeholders, and meet compliance requirements — all while trying to restore business operations under intense pressure,” he added.

A common worst-case scenario for organizations is when ransomware becomes destructive, “meaning everything within the organization’s environment comes to a screeching halt,” according to Microsoft’s Look. “In situations like these organizations cannot conduct any business internally, nor can their customers access their accounts.”

Infected companies usually don’t want to pay the ransom demand, which pumps more money into the criminal ecosystem. Plus, responding to these incidents typically takes longer. 

“This can keep the organization in a disrupted, frozen state that is stressful for employees, customers and investors,” Look said. “In those scenarios, the organization might realize they are understaffed and do not have an updated cyber resilience plan.”

IR teams don’t work in a vacuum

It’s also important to realize that IR teams “do not have the luxury of working in a vacuum.” Boards of directors, insurance companies, the media, regulators, and even law enforcement all want status updates. 

“That type of pressure creates a lot of chaos if organizations do not account for it in crisis planning and it can really impact teams’ ability to prioritize where to focus their energies,” Microsoft’s Look said. 

All of the experts interviewed for this story stressed the importance of maintaining an up-to-date and well-rehearsed cyber resilience plan when asked for the top advice they dole out to companies on how not to f-up incident response.

They also emphasized calling in professionals and not relying on an existing IT or managed services provider if hit by a major attack.

Yes, this is self-serving, as these are all top-tier incident responders who are regularly called to investigate nation-state attacks and major ransomware infections — or to clean up messes made by earlier IR firms. 

Still, there’s something to be said for bringing in the big guns should you find yourself in the middle of a really bad breach. And seeing as they are experts in the field, their advice on how not to screw up an incident response (IR) is solid.

‘Be IR ready’

“Top advice from Microsoft Incident Response: Be IR ready,” Look said, adding this means “having a current incident response plan that is both regularly rehearsed, and able to be updated. You also want to have an incident response retainer already in place, so that services you may need to navigate the incident and any potential legal, insurance and other fallout are available to you on-demand.”

If your company is receiving security services from more than one vendor during an incident, ask them to share information and work together. 

“Some companies think they are protecting themselves by keeping all the vendors apart, but security is a team sport,” Look told us. “The more knowledge sharing and collaboration, the faster and more effective an investigation can typically take place.”

She also offers some secondary advice: if you are running old, outdated systems that no longer receive security updates and vendor support, develop a plan to invest in modernization. Yes, this is expensive, but it will reduce your organization’s attack surface, making it harder for digital intruders to break in, and it will save money down the road.

Also, “never waste an opportunity to think about rebuilding your systems,” Hunter Strategy’s Williams said. “It’s fairly rare for CISOs and CIOs today to get fired over a single incident, unless there’s broad incompetence that led to it.”

However, it is fairly common for organizations to find themselves compromised a second time, only to discover that the intruders were never fully kicked out of their systems after the first incident. That is a much more serious offence for the security team that handled the first incident response and mitigation efforts, not to mention the bad press and loss of brand reputation that follow two data breaches in short succession.

“I always try to make it personal for folks and say, hey, look, you’re taking a huge risk, a personal risk and a career risk, by not rebuilding,” Williams says. “People try to clean malware off of systems rather than rebuilding systems. But you just can’t ever deem a system clean once a threat actor has been on it.”

CrowdStrike’s Perry said the most important advice he gives to companies following an incident is “slow down and take a methodical approach — even under pressure. It’s natural to want to jump straight into remediation, but without a structured response, you risk destroying critical forensic evidence or missing key indicators that could reveal the full scope of the attack.”

Before making any changes, be sure to capture volatile data, preserve logs and document everything, he added. 
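As a toy illustration of the "preserve logs and document everything" step, here is a minimal Python sketch that copies log files and records their SHA-256 hashes before anyone touches the boxes. Real responders reach for dedicated forensic tooling (disk images, memory captures), not an ad-hoc script; the function and file names here are invented.

```python
import hashlib
import shutil
from pathlib import Path

def preserve_logs(log_dir, evidence_dir):
    # Copy each log verbatim and record a SHA-256 hash alongside it, so
    # the copies can later be shown to match what was collected.
    src, dst = Path(log_dir), Path(evidence_dir)
    dst.mkdir(parents=True, exist_ok=True)
    manifest = {}
    for log in sorted(src.glob("*.log")):
        copy = dst / log.name
        shutil.copy2(log, copy)  # copy2 also keeps timestamps: metadata matters
        manifest[log.name] = hashlib.sha256(copy.read_bytes()).hexdigest()
    # Write the hashes out as a manifest; this is the "document
    # everything" half of the advice.
    (dst / "MANIFEST.txt").write_text(
        "\n".join(f"{digest}  {name}" for name, digest in manifest.items())
    )
    return manifest
```

The point is the ordering, not the tooling: evidence gets copied and hashed first, and remediation touches the originals only after that.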

And if you don’t have an IR plan in place, there’s no time like the present. Just make sure it doesn’t sit in a corner and gather dust. As Perry noted: “Practice makes perfect.” ®
