In this blog post, we will perform a deep analysis into GootLoader, malware which is known to deliver several types of payloads, such as Kronos trojan, REvil, IcedID, GootKit payloads and in this case Cobalt Strike.
In our analysis we’ll be using the initial malware sample itself together with some malware artifacts from the system it was executed on. The malicious JavaScript code is hiding within a jQuery JavaScript Library and contains about 287kb of data and consists of almost 11.000 lines of code. We’ll do a step-by-step analysis of the malicious JavaScript file.
TLDR techniques we used to analyze this GootLoader script:
Filename: sample_supplier_quality_agreement 33187.js
MD5: dbe5d97fcc40e4117a73ae11d7f783bf
SHA256: 6a772bd3b54198973ad79bb364d90159c6f361852febe95e7cd45b53a51c00cb
File Size: 287 KB
To find the trojan downloader inside this JavaScript file, the following grep command was executed:
grep -P "^[a-zA-Z0-9]+\("
This grep command will find entry points that are calling a JavaScript function outside any function definition, thus without indentation (leading whitespace). This is a convention that many developers follow, but it is not a guarantee to quickly find the entry point. In this case, the function call hundred17(3565) looks out of place in a mature JavaScript library like jQuery.
When tracing the different calls, there’s a lot of obfuscated code, the function “color1” is observed Another way to figure out what was changed in the script could be to compare it to the legitimate version[1] of the script and “diff” them to see the difference. The legitimate script was pulled from the jQuery website itself, based on the version displayed in the beginning of the malicious script.
Before starting a full diff on the entire jQuery file, we first extracted the functions names with the following grep command:
grep 'function [0-9a-zA-Z]'
This was done for both the legitimate jQuery file and the malicious one and allows us to quickly see which additional functions were added by the malware creator. Comparing these two files immediately show some interesting function names and parameters:
A diff on both files without only focusing on the function names gave us all the added code by the malware author.
Color1 is one of the added functions containing most of the data, seemingly obfuscated, which could indicated this is the most relevant function.
The has6 variable is of interest in this function, as it combines all the previously defined variables into 1:
Further tracing of the functions eventually leads to the main functions that are responsible for deobfuscating this data: “modern00” and “gun6”
The deobfuscation algorithm is straightforward:
For each character in the obfuscated string (starting with the first character), add this character to an accumulator string (initially empty). If the character is at an uneven position (index starting from 0), put it in front of the accumulator, otherwise put it at the back. When all characters have been processed, the accumulator will contain the deobfuscated string.
The script used to implement the algorithm would look similar to the following written in Python:
CreateObject, observed in the deobfuscated script, is used to create a script execution object (WScript.Shell) that is then passed the script to execute (first script). This script (highlightd in white) is also obfuscated with JavaScript obfuscation and the same script obfuscation that was observed in the first script.
Deobfuscating that script yields a second JavaScript script. Following, is the second script, with deobfuscated strings and code, and “pretty-printed”:
This script is a downloader script, attempting to initiate a download from 3 domains.
The HTTPS requests have a random component and can convey a small piece of information: if the request ends with “4173581”, then the request originates from a Windows machine that is a domain member (the script determines this by checking for the presence of environment variable %USERDNSDOMAIN%).
The following is an example of a URL:
hxxps://www[.]labbunnies[.]eu/test[.]php?cmqqvfpugxfsfhz=71941221366466524173581
If the download fails (i.e., HTTP status code different from 200), the script sleeps for 12 seconds (12345 milliseconds to be precise) before trying the next domain. When the download succeeds, the next stage is decoded and executed as (another) JavaScript script. Different methods were attempted to download the payload (with varying URLs), but all methods were unsuccessful. Most of the time a TCP/TLS connection couldn’t be established to the server. The times an HTTP reply was received, the body was empty (content-length 0). Although we couldn’t download the payload from the malicious servers, we were able to retrieve it from VirusTotal.
We were able to find a payload that we believe, with high confidence, to be the original stage 2. With high confidence, it was determined that this is indeed the payload that was served to the infected machine, more information on how this was determined can be found in the following sections. The payload, originally uploaded from Germany, can be found here: https://www.virustotal.com/gui/file/f8857afd249818613161b3642f22c77712cc29f30a6993ab68351af05ae14c0f
MD5: ae8e4c816e004263d4b1211297f8ba67
SHA-256: f8857afd249818613161b3642f22c77712cc29f30a6993ab68351af05ae14c0f
File Size: 1012.97 KB
The payload consists of digits. To decode it, take 2 digits, add 30, convert to an ASCII character, and repeat this till the end of the payload. This deobfuscation algorithm was deduced from the previous script, in the last step:
As an example, we’ll decode the first characters of the strings in detail: 88678402
This results in: “var “, which indicates the declaration of a variable in JavaScript. This means we have yet another JavaScript script to analyze.
To decode the entire string a bit faster we can use a small Python script, which will automate the process for us:
First half of the decoded string:
Second half of the decoded string:
The same can be done with the following CyberChef recipe, it will take some time, due to the amount of data, but we saw it as a small challenge to use CyberChef to do the same.
#recipe=Regular_expression('User%20defined','..',true,true,false,false,false,false,'List%20matches')Find_/_Replace(%7B'option':'Regex','string':'%5C%5Cn'%7D,'%2030,',true,false,true,false)Fork(',','',false)Sum('Space')From_Charcode('Comma',10)
The decoded payload results in another JavaScript script.
MD5: a8b63471215d375081ea37053b52dfc4
SHA256: 12c0067a15a0e73950f68666dafddf8a555480c5a51fd50c6c3947f924ec2fb4
File size: 507 KB
The JavaScript script contains code to insert an encoded PE file (unmanaged code) and create a key with as value as encoded assembly (“HKEY_CURRENT_USER\SOFTWARE\Microsoft\Phone”) and then launches 2 PowerShell scripts. These 2 PowerShell scripts are fileless, and thus have no filename. For referencing in this document, the PowerShell scripts are named as follows:
A custom script was utilized to decode this payload as a whole and extract all separate elements from it (based on the reverse engineering of the script itself). The following is the output of the custom script:
All the artifacts extracted with this script match exactly with the artifacts recovered from the infected machine. These can be verified with the fileless artifacts extracted from Defender logs, with matching cryptographic hash:
Based on this information, it can be concluded, with high confidence, that the payload found on VirusTotal is identical to the one downloaded by the infected machine: all hashes match with the artifacts from the infected machine.
In addition to the evidence these matching hashes bring, the stage 2 payload file also ends with the following string (this is not part of the encoded script): @[email protected] This is the random part of the URL used to request this payload. Notice that it ends with 4173581, the unique number for domain joined machines found in the trojanized jQuery script.
Although VirusTotal has reports for several URLs used by this malicious script, none of the reports contained a link to the actual downloaded content. However, using the following query: content:”378471678671496876716986″, the download content (payload) was found on VirusTotal; This string of digits corresponds to the encoding of string “CreateObject”. (see Fig. 20)
In order to attempt the retrieval of the downloaded content, an educated guess was made that the downloaded payload would contain calls to function CreateObject, because such functions calls are also present in the trojanized jQuery script. There are countless files on VirusTotal that contain the string “CreateObject”, but in this particular case, it is encoded with an encoding specific to GootLoader. Each letter of the string “CreateObject” is encoded to its numerical representation (ASCII code), and subtracted with 30. This returns the string “378471678671496876716986”.
MD5 Assembly: d401dc350aff1e3fd4cc483238208b43
SHA256 Assembly: f1b33735dfd1007ce9174fdb0ba17bd4a36eee45fadcda49c71d7e86e3d4a434
File Size: 13.50 KB
This .NET loader is fileless and thus has no filename.
The PowerShell loader script (powershell_loader)
The .NET Loader is encoded in hexadecimal and stored inside the registry. It is slightly obfuscated: character # has to be replaced with 1000.
The .NET loader:
The DLL is encoded in hexadecimal, but with an alternative character set. This is translated to regular hexadecimal via the following table:
This Test function decodes the DLL and executes it in memory. Note that without the .NET loader, statistical analysis could reveal the DLL as well. A blog post[2], written by our colleague Didier Stevens on how to decode a payload by performing statistical analysis can offer some insights on how this could be done.
MD5 DLL: 92a271eb76a0db06c94688940bc4442b
SHA256 DLL: 63bf85c27e048cf7f243177531b9f4b1a3cb679a41a6cc8964d6d195d869093e
This is a typical Cobalt Strike beacon and has the following configuration (extracted with 1768.py)
Now that Cobalt Strike is loaded as final part of the infection chain, the attacker has control over the infected machine and can start his reconnaissance from this machine or make use of the post-exploitation functionality in Cobalt Strike, e.g. download/upload files, log keystrokes, take screenshots, …
The analysis of the trojanized jQuery JavaScript confirms the initial analysis of the artifacts collected from the infected machine and confirms that the trojanized jQuery contains malicious obfuscated code to download a payload from the Internet. This payload is designed to filelessly, and with boot-persistence, instantiate a Cobalt Strike beacon.
About the authors
Didier Stevens | Didier Stevens is a malware expert working for NVISO. Didier is a SANS Internet Storm Center senior handler and Microsoft MVP, and has developed numerous popular tools to assist with malware analysis. You can find Didier on Twitter and LinkedIn. |
Sasja Reynaert | Sasja Reynaert is a forensic analyst working for NVISO. Sasja is a GIAC Certified Incident Handler, Forensics Examiner & Analyst (GCIH, GCFE, GCFA). You can find Sasja on LinkedIn. |
You can follow NVISO Labs on Twitter to stay up to date on all our future research and publications.
[1]:https://code.jquery.com/jquery-3.6.0.js
[2]:https://blog.didierstevens.com/2022/06/20/another-exercise-in-encoding-reversing/