Malware authors continuously evolve their tactics to evade detection by security tools, and sandbox evasion techniques are a critical component of this cat-and-mouse game. In this article, we’ll look in detail at the three primary categories of sandbox evasion techniques employed by modern malware and at the methods threat actors use to bypass these analysis environments. By understanding these evasion tactics, security professionals can stay one step ahead and fortify their defenses against advanced persistent threats.
The use of malware analysis sandboxes as the silver bullet against advanced, persistent threats became popular over a decade ago. Back then, malware authors had already found ways to evade tools based on static analysis (such as traditional antivirus software products) using techniques such as polymorphism, metamorphism, encryption, obfuscation and anti-reversing protection. As a result, malware analysis sandboxes are now considered the last line of defense against advanced threats.
The operating principle of a sandbox is simple – determine if a file is malicious or not based on its observed behavior in a controlled environment. The sandbox allows the malware to perform all of its malicious operations and records the resulting behavior. After some time, the analysis is stopped and the result is examined and scanned for typical malicious behavior patterns. Since detection is not based on signatures, sandboxes can even detect zero-day and targeted malware (which typically has never been seen before by security researchers or analyzed in an antivirus lab).
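As a rough illustration of this principle, the sketch below scores a recorded behavior log against a few patterns. The `Event` schema, the rules, the point values, and the threshold are all hypothetical and chosen only for the example; they do not describe how any particular sandbox product scores behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One recorded action from an analysis run (hypothetical schema)."""
    kind: str    # e.g. "file_write", "registry_set", "network_connect"
    target: str  # path, registry key, or host touched by the action


# Hypothetical rules: (event kind, substring of target, score contribution).
RULES = [
    ("registry_set", "\\CurrentVersion\\Run", 40),  # autostart registry entry
    ("file_write", "\\Startup\\", 30),              # file dropped in a startup folder
    ("network_connect", ":4444", 20),               # connection to an unusual port
]


def score_behavior(events, rules=RULES):
    """Return (score capped at 100, list of matched rules) for a behavior log."""
    matched = []
    total = 0
    for kind, needle, points in rules:
        if any(e.kind == kind and needle.lower() in e.target.lower() for e in events):
            matched.append((kind, needle))
            total += points
    return min(total, 100), matched


def verdict(score, threshold=50):
    """Map a score to a verdict; the threshold is an arbitrary example value."""
    return "malicious" if score >= threshold else "no malicious behavior observed"
```

Note that an empty log scores zero: if the sample does nothing during the analysis window, a scheme like this has nothing to flag.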
Obviously, behavior-based malware detection only works if the observed file actually performs malicious operations during its analysis. If, for whatever reason, no harmful operations are executed during the analysis, the sandbox concludes that the file under examination is benign. Malware authors are always looking for new, innovative ways to evade sandbox detection by concealing the real behavior of malware. We’ve grouped these approaches into three categories: sandbox detection, exploiting sandbox gaps and weaknesses, and context-aware malware.
The first approach, sandbox detection, looks for small differences between a sandbox environment and a real victim’s system. If a sandbox is detected, malware usually reacts in one of two ways: it either terminates immediately (which is in itself suspicious) or it shows non-malicious behavior and performs only benign operations. An example of this is shown in Figure 1, where the sample attempts to detect whether it is running inside a virtual machine (VM) and checks whether an application sandbox (Sandboxie) is running.
We can see this in the VMRay Threat Identifier (VTI) details showing that VMRay identified the sandbox detection attempts and scored this behavior as highly malicious.
The second category of evasion techniques directly attacks and exploits weaknesses in the underlying sandbox technology or in the surrounding ecosystem. For example, we have recently seen a large volume of malware using Microsoft COM internally because most sandboxes cannot correctly analyze such samples.
Other malware will use obscure file formats that cannot be handled by the sandbox, or will exploit the sandbox’s inability to process files that exceed a certain size. In Figure 2 we can see an example of malware ‘blinding the monitor’, in this case through illegitimate API usage. This can be an effective method to hide from malware sandboxes that rely on a hook or driver injected into the target machine. However, since the VMRay Platform does not use hooking, the evasion attempt was detected.
The third category, context-aware malware, does not try to detect or attack the sandbox at all. Instead, it exploits the natural shortcomings of such automated systems. Because of the high volumes of unique malware seen in most environments, sandbox analysis systems usually spend only a few minutes on each file. Thus, by delaying the execution of a malicious payload by a certain amount of time, malware can remain undetected. Besides time triggers, malware can also use other events that usually do not occur in a sandbox, e.g. a system reboot or user interaction. Additionally, the malware may look for specific artifacts present on the intended target machine, such as an application or localization setting.
In Figure 3, we see an analysis where the malware, in addition to attempting to detect a VM environment, engaged in ‘persistence’, installing startup scripts and applications to survive reboot:
There are a number of techniques to identify the existence of a sandbox. Once detected, the malware can react in different ways. The simplest step is to immediately terminate. This can raise a red flag since this is not the behavior of a normal, benign program. Another action is to show a bogus error message. For example, the malware may display a message that a certain system module is missing or the executable file has been corrupted. More sophisticated malware may perform some benign operations to conceal the real intention. Let’s take a deeper look into the different techniques used by malware in the wild to detect if it is being executed in a sandbox:
Detecting virtualization is one of the oldest evasion techniques. However, it is less relevant today, as many production environments (workstations and servers) are virtualized anyway and virtual machines (VMs) are no longer used only by researchers and malware analysts. The earliest approach detected technical artifacts that existed due to the lack of full hardware support for virtualization (paravirtualization). These techniques include:
These techniques are not very effective today. With hardware virtualization support, there are very few visible artifacts (if any) inside the VM since most hardware aspects are now virtualized and handled by the CPU itself. Therefore, they do not have to be simulated by the hypervisor.
We’ve published two analyses demonstrating a couple of types of virtualization detection. In the first, we see an attempt to detect if the malware is running inside VirtualPC:
Another approach is to detect the presence of a VM by looking at registry values. In this example, the malware queried the registry key “HKEY_LOCAL_MACHINE\SOFTWARE\” to look for values associated with common VM implementations like VMware:
In this approach, it is not the hypervisor that the malware is trying to detect, but the sandbox itself. This can be done using either of the following two techniques:
An example of vendor-specific detection can be seen in Figure 3, where the malware looks for the presence of the module ‘SbieDll.dll’, an indicator that it is running under Sandboxie, a common sandboxing environment:
Sandboxes are usually not production systems but specifically set up for malware analysis. Hence, they are not identical to real computer systems and these differences can be detected by malware. Differences may include:
To demonstrate, we’ll go back to the same analysis we looked at in Figure 2. In addition to checking for VM presence, the malware is looking for the presence of Wine, a software emulator (that is, it emulates Windows functions rather than a CPU). We can see here in the VTI Score that the malware issues a GET_PROC_ADDRESS query and attempts to determine from the returned result whether it is what would be expected in a Wine environment:
Monitoring the behavior of an application comes with a timing penalty, which can be measured by malware to detect the presence of a sandbox. Sandboxes try to prevent this by faking the time. However, malware can bypass this by incorporating external time sources such as NTP.
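To make the timing penalty concrete, here is a small, self-contained analogy in Python rather than a real sandbox monitor. The `monitored` wrapper and the `work` function are hypothetical names invented for this sketch; the wrapper simply records each call before forwarding it, which is enough to show that observation adds measurable time.

```python
import time


def monitored(fn, log):
    """Wrap fn so that every call is recorded before it runs (a stand-in for a monitor)."""
    def wrapper(*args, **kwargs):
        log.append((fn.__name__, args))
        return fn(*args, **kwargs)
    return wrapper


def time_calls(fn, repeats=100_000):
    """Return the wall-clock seconds needed to call fn(i) `repeats` times."""
    start = time.perf_counter()
    for i in range(repeats):
        fn(i)
    return time.perf_counter() - start


def work(x):
    """Trivial workload used only for the comparison."""
    return x + 1


if __name__ == "__main__":
    log = []
    plain = time_calls(work)
    observed = time_calls(monitored(work, log))
    print(f"plain: {plain:.4f}s  monitored: {observed:.4f}s")
```

The exact numbers depend on the machine, so no expected output is given. The point is only that the monitored run takes measurably longer, and that the gap can be measured from inside the process.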
In Figure 5 you can see an example of timing-based detection. The VMRay Analyzer Report shows that the sample executed rdtsc, the instruction that reads the CPU’s time-stamp counter.
In order to evade these types of detection by malware, an analysis environment should:
In particular, a common approach for sandbox analysis is hooking. The presence of a hook (the injected user-mode or kernel-level driver that monitors and intercepts API calls and other malware activity) is a telltale sign for malware. It is virtually impossible to completely hide the presence of a hook.
While a perfectly-implemented emulation environment will be, in theory, difficult to detect, this is a complex undertaking. Just as all software has bugs, it’s a near certainty that any given emulation environment will have flaws that can be detected.
If the malware sandbox can run an image copied from actual production endpoints, then the risk of detection falls dramatically. Coupling that with randomization of the environment helps to ensure that there are no telltale signs that let malware identify the target environment as ‘fake’.
VMRay’s technology ensures that there is a minimal attack surface for malware to detect it is running in a sandbox. By not modifying the target environment, not relying on emulation, and allowing real-world images to run as target environments, VMRay gives nothing for malware to flag as a sandbox environment.
As we wrote earlier, malware analysis sandboxes doing behavior-based detection are now considered the final layer of defense against advanced threats, because malware authors had long since found ways to evade tools based on static analysis (such as traditional antivirus software products).
Behavior-based malware detection only works, however, if the observed file actually performs malicious operations during its analysis. In the second part of the series, we did a deep dive into how malware can directly detect the presence of a sandbox environment. Let’s now look at how malware can exploit gaps in the sandbox environment, rather than explicitly detecting the presence of a sandbox.
Explicitly searching for the existence of a sandbox can be detected as a suspicious activity during analysis. A more advanced approach for malware, therefore, exploits weaknesses in the sandbox technology to perform operations without being detected. By exploiting these sandbox weaknesses, malware does not have to worry about being detected even if it is being executed in a sandboxed system. Some of the techniques include:
Most sandboxes do in-guest monitoring, i.e., they place code, processes, and/or hooks inside the analysis environment. If these modifications are undone or circumvented, the sandbox is blinded; in other words, visibility into the analyzed environment is lost. This blinding can take the following forms:
Hooks can be removed by restoring the original instruction or data.
Hooks can be circumvented by using direct system calls instead of APIs, calling private functions (which are not hooked), or performing unaligned function calls (skipping the “hook code”). We can see an example of this in Figure 1 where illegitimate API usage is utilized by the malware. While hooks could solve this problem for these particular internal functions, there are many of these present in the operating system and they vary with each Windows version. Furthermore, the problem of unaligned function calls cannot be adequately solved by hooking.
Hooks usually reside in the system files that are mapped into memory. Some malware will unmap those files and reload them. The newly loaded file version is then “unhooked”.
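The loss of visibility described above can be illustrated with a deliberately simplified Python analogy. It involves no operating-system internals. A hook is modeled as a wrapper installed over a function, and removal is modeled as restoring the original reference. The `api` namespace, `open_file`, `install_hook`, and `hook_intact` are hypothetical names used only for this sketch.

```python
import types

# Stand-in for a system library (hypothetical; not a real OS interface).
api = types.SimpleNamespace()


def _open_file(path):
    return f"handle:{path}"


api.open_file = _open_file


def install_hook(namespace, name, log):
    """Replace namespace.name with a logging wrapper; return the original function."""
    original = getattr(namespace, name)

    def hooked(*args):
        log.append((name, args))  # the monitor sees the call...
        return original(*args)    # ...and then forwards it unchanged

    setattr(namespace, name, hooked)
    return original


def hook_intact(namespace, name, original):
    """Monitor-side integrity check: True while the wrapper is still installed."""
    return getattr(namespace, name) is not original


log = []
original = install_hook(api, "open_file", log)
api.open_file("a.txt")      # recorded by the hook
api.open_file = original    # original restored: the hook is gone
api.open_file("b.txt")      # no longer recorded
print(log)                                      # [('open_file', ('a.txt',))]
print(hook_intact(api, "open_file", original))  # False
```

Once the original is restored, the second call never reaches the log. A periodic integrity check like `hook_intact` can notice the removal, but only after the fact, and it does nothing about calls that bypass the hooked entry point altogether.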
Many sandboxes are not capable of monitoring kernel code or the boot process of a system.
Many sandboxes do not support all file formats. PowerShell, .hta, and .dzip are examples of file formats that may slip by and simply fail to execute in a sandbox environment.
While the initial infection vector (say, a Word document with a macro) may open and the macro may run in the sandbox, the macro will download and run a payload that uses an obscure technology hidden from the analysis. COM, Ruby, ActiveX, and Java are some examples that we’ve analyzed in previous blog posts.
Many sandboxes cannot survive a reboot. Some systems try to emulate a reboot by re-logging in the user. This can be detected, however, and not all triggers of a reboot are executed.
By simply overwhelming the target analysis environment, malware can also avoid analysis with this crude but sometimes effective approach. For example,
In order to ensure malware cannot evade analysis by these methods, a sandbox analysis environment should:
In particular, a common approach for sandbox analysis is hooking. The presence of a hook (the injected user-mode or kernel-level driver that monitors and intercepts API calls and other malware activity) gives malware the opportunity to disable analysis.
For efficiency and convenience, many sandboxes take a ‘one size fits all’ approach: a single type of target environment is used for all analyses. A better approach is to use the actual gold images (that is, the standard desktop and server OS and application configurations that your enterprise uses) as the target environment. That way, you can be assured that any malware that is targeting your enterprise and could run on your desktops or servers will also run in the analysis environment.
Some malware sandboxes, particularly those using a hooking-based approach, take shortcuts and compromises for the sake of efficiency in determining what activity is monitored. This can leave blind spots.
VMRay’s technology accommodates all these scenarios. When used with real-world VM images as the target analysis machines, VMRay Analyzer will give full visibility into malware activity, regardless of attempts by the malware to obfuscate its intentions.
This is the final part in our series on sandbox evasion techniques used by malware today. We started with a primer and then covered the first two main categories of evasion techniques: sandbox detection and exploiting malware sandbox gaps.
In this part, we highlight context-aware evasion techniques, which use time-, event-, and environment-based triggers that are unlikely to be activated during sandbox analysis.
This category of evasion techniques, like exploiting sandbox technology gaps, does not try to detect a sandbox. Nor does it try to conceal malicious behavior by circumventing a sandbox or exploiting a sandbox’s weaknesses.
Instead, it delays or postpones its malicious payload until a certain trigger/event occurs. The trigger that is chosen is very unlikely to be activated inside a sandbox. Triggers can be grouped into four categories:
One of the most common techniques is to delay execution for a certain amount of time, since sandboxes usually run samples for only a few minutes. As with many other evasion techniques, the use of time bombs is an ongoing cat-and-mouse game: the malware goes to sleep, the sandbox tries to detect the sleep and shorten the time, the malware detects the shortened time, the sandbox tries to hide the time skip by also updating system timers, and so on. Time bomb techniques include:
For example, Figure 1 shows a Pafish test running VM detection of certain artifacts that often exist in analysis environments. Note the timestamp checks. Malware will also run checks like this and if a difference is found in the counters, shut down on the assumption it is running inside an analysis environment.
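From the analyst’s side, one simple heuristic for the delay tactic is to add up the delays a sample requested and compare the total with the analysis window. The sketch below assumes a hypothetical trace format of `(api_name, milliseconds)` tuples with all delays already normalized to milliseconds, and a 180-second window chosen purely as an example.

```python
# Windows APIs commonly used to pause execution.
DELAY_APIS = {"Sleep", "SleepEx", "NtDelayExecution"}


def requested_delay_seconds(trace):
    """Sum the delays requested in the trace.

    `trace` is a list of (api_name, milliseconds) tuples. This is a hypothetical
    log format; delays are assumed to be normalized to milliseconds already.
    """
    return sum(ms for api, ms in trace if api in DELAY_APIS) / 1000.0


def exceeds_analysis_window(trace, window_seconds=180):
    """True if the sample asked to wait at least as long as the analysis runs."""
    return requested_delay_seconds(trace) >= window_seconds
```

A sample that requests more delay than the analysis can observe is not necessarily malicious, but it is a reason to extend the run or inspect the code that follows the delay. This check only covers explicit sleep calls; busy loops and other forms of stalling need different heuristics.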
The malware becomes active only on shutdown, after reboot, or when someone logs on or off. Figure 2 shows an example of this, where a second-stage payload is pulled down only after a reboot. We can see in the VTI score that an executable is installed by the malware (the initial payload) that will run automatically on startup after reboot. It’s this startup process that fetches the second payload.
Sophisticated targeted malware only works on the intended target system. The identification is usually based on the current username, time zone, keyboard layout, IP address, or some other system artifacts. The check itself can be done in various ways, ranging from simple to very complex methods.
The malware will only proceed to the second stage (that downloads the main payload) if it determines it is in the expected target environment.
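On the analysis side, this pattern suggests a simple heuristic: a sample that asks several questions about its environment and then does nothing deserves a closer look. The sketch below assumes a hypothetical trace that is just a list of API names. The mapping from API to artifact and the `min_queries` threshold are illustrative choices, not a description of any product’s logic.

```python
# Windows APIs that reveal the artifacts mentioned above (illustrative selection).
ENVIRONMENT_APIS = {
    "GetUserNameW": "username",
    "GetTimeZoneInformation": "time zone",
    "GetKeyboardLayout": "keyboard layout",
    "GetAdaptersInfo": "IP address / network adapters",
    "GetSystemDefaultLCID": "localization setting",
}


def environment_queries(api_calls):
    """Return the distinct environment artifacts queried, in first-seen order."""
    seen = []
    for name in api_calls:
        artifact = ENVIRONMENT_APIS.get(name)
        if artifact and artifact not in seen:
            seen.append(artifact)
    return seen


def looks_context_aware(api_calls, payload_observed, min_queries=2):
    """Flag samples that profile the environment but never show a payload."""
    return len(environment_queries(api_calls)) >= min_queries and not payload_observed
```

For example, `looks_context_aware(["GetUserNameW", "GetKeyboardLayout", "ExitProcess"], payload_observed=False)` returns `True`, while the same trace with `payload_observed=True` does not, since the payload was seen anyway.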
Related to this is the inverse scenario where the malware detects that the environment is most likely an artificial analysis environment. This can be the result of checks such as:
Of the three categories of sandbox evasion techniques we have blogged about, context-aware malware is the least sensitive to the underlying malware sandbox technology. As sandbox technology improves and finds ways to circumvent sandbox detection, environmental triggers will become increasingly important to malware authors.
It is critical for security teams to ensure they are using target analysis environments that accurately replicate in every detail the actual desktop and server environments they are protecting. Furthermore, as we wrote previously, it’s important to have pseudo-random attributes as part of the target analysis environment.
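As a minimal sketch of what pseudo-random attributes could look like, the function below generates a small environment profile from a seed. The attribute names, value ranges, and username list are hypothetical and chosen only for illustration. A real deployment would start from the organization’s own gold images and vary whatever attributes matter there.

```python
import random

# Hypothetical pool of account names; a real setup would mirror local naming conventions.
USERNAMES = ["j.miller", "a.novak", "s.tanaka", "m.okafor", "l.garcia"]


def random_profile(seed=None):
    """Return pseudo-random attributes for one analysis run.

    Passing a seed makes the profile reproducible, which helps when an
    analysis has to be repeated under identical conditions.
    """
    rng = random.Random(seed)
    return {
        "username": rng.choice(USERNAMES),
        "hostname": f"WS-{rng.randrange(1000, 10000)}",
        "uptime_hours": rng.randint(4, 400),
        "recent_documents": rng.randint(5, 60),
    }
```

Using a dedicated `random.Random` instance keeps the profile independent of any other randomness in the analysis pipeline.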
Generic sandboxes running identical standard target environments are no longer sufficient. Further, the analysis environment needs to be able to detect environment queries and identify hidden code branches. VMRay Analyzer has the ability to randomize analysis environments, including when desktop or server gold images are used as the targets. Additionally, VMRay Analyzer will flag when malware is making environment queries. Combined, these ensure that security teams get the full picture and know when they are dealing with context-aware malware.