NVISO employs several hunting rules in multiple Threat Intelligence Platforms and other sources, such as VirusTotal. As you can imagine, there is no lack of APT (Advanced Persistent Threat) campaigns, cybercriminals and their associated malware families and campaigns, phishing, and so on. But now and then, something slightly different and perhaps novel passes by.
In this blog post, we’ll describe such a campaign which we assess has been created by an actor with at least a medium level of technical competence due to multiple obfuscation layers in ultimate payload delivery.
Stage 1 – Initial Lure
The initial lure consists of a classic “bait” scheme involving a so-called voicemail left by a caller, urging the recipient to download a ZIP file containing an HTML attachment, as shown in Figure 1.
Figure 2 displays the further email context.
From the email context, it appears the email was sent from a legitimate business which has been compromised in order to send out the phishing campaign.
The subject of the email in turn is simply the name of the targeted company, and the attachment name follows the pattern of “Companyname Micro.protected.zip”. The ZIP contains a single HTML file which follows the same naming pattern.
The HTML contains obfuscated JavaScript and once the intended target opens the HTML file, it will automatically perform several rounds of decoding, deobfuscation and decryption to eventually lead to a customised spearphishing page. This technique is also known as “HTML smuggling”, a technique where attackers embed certain code, such as JavaScript, into an HTML file which will be automatically rendered (loaded) by the browser.
The technique has gained popularity over the years as it may bypass certain security mechanisms such as email gateways or even endpoint detection systems as an HTML file typically does not contain malicious content.
Let’s dive into the next stages.
Stage 2 – The HTML and JavaScript
To follow along, the HTML we will analyse in the next sections has the following properties:
Note the sample has also been made available on MalwareBazaar.
As described previously, the JavaScript embedded in the HTML will be similar to what’s shown in Figure 3 below.
The script initialises an array of what appears to be a mix of binary (e.g. 1100011) and hex values (e.g. 3c). When dealing with JavaScript, the decoding functionality is typically near the end of the file, and this is no different in our case, as displayed in Figure 4:
The following snippet displays the beautified decoding function:
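// arr is the array of mixed binary/hex strings initialised at the top of the script
let decodedHtml = "";
for (let mixed of arr) {
    // A leading '1' marks a binary value; anything else is treated as hex
    let isHex = mixed[0] != '1';
    if (isHex) {
        decodedHtml += String.fromCharCode(parseInt(mixed, 16));
    } else {
        decodedHtml += String.fromCharCode(parseInt(mixed, 2));
    }
}
// Render the decoded next stage in the current page
document.write(decodedHtml)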
The decoding function is straightforward: it checks whether each value in the array is in hex or binary format (a value starting with 1 is treated as binary) and decodes it using fromCharCode. Once completed, it will “write” the decoded content to the same HTML page and execute it.
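As a minimal sketch of that loop outside the browser, the following can be run with Node.js. The function name decodeMixed and the input array are made up for illustration and are not taken from the sample; the sample itself writes the result to the page rather than returning it.

```javascript
// Decode an array of mixed binary/hex strings, mirroring the sample's loop.
function decodeMixed(tokens) {
  let decoded = "";
  for (const token of tokens) {
    // Same test as the sample: a leading '1' means binary, anything else hex.
    const radix = token[0] === "1" ? 2 : 16;
    decoded += String.fromCharCode(parseInt(token, radix));
  }
  return decoded;
}

// Made-up input (not from the sample) that spells a short HTML tag.
const demoTokens = ["3c", "1101000", "1110100", "1101101", "1101100", "3e"];
console.log(decodeMixed(demoTokens)); // <html>
```

Because the check only looks at the first character, this scheme presumably relies on hex values never starting with 1, which holds for printable ASCII (0x20 to 0x7e).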
When dealing with any type of JavaScript, there’s a classic set of clues that will indicate something malicious going on, such as eval, document.write, certain verbs and functions…
Decoding or deobfuscation can happen in a myriad of ways: for example, we can redirect output to a message box, or we can substitute an eval for another action. In this case, however, we can take a simple approach: just let the script decode itself!
Stage 3 – The JavaScript and… the JavaScript
But how can we let the script decode itself? The easiest way is to replace document.write with console.log. This will simply output or redirect the content of decodedHtml, which ultimately contains the payload or next stage, to the console.
To perform this, open up your favourite browser: in our example, we’ll use Google Chrome. Note that you should still perform this kind of analysis in a secure environment such as a sandbox or Virtual Machine: some JavaScript payloads may contain an exploit, immediately download and execute malware (rather than display a phishing page), or may contain some sneaky evals that, when missed, might alter the results you are expecting.
Open Google Chrome, go to Developer Tools (or press CTRL + SHIFT + i) and open the Console tab. You may need to allow pasting, after which we can paste in the content of the HTML file. Note you will need to remove the <script> tags for it to function correctly.
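The two manual edits (dropping the script tags and swapping document.write for console.log) could also be scripted. The following is a rough sketch under the assumption that the file contains a direct document.write( call; prepareForConsole and the demo input are hypothetical and not part of the sample. It only rewrites text and never executes it.

```javascript
// Turn stage-1 HTML into script text that logs instead of rendering.
function prepareForConsole(html) {
  return html
    // Drop opening and closing script tags.
    .replace(/<\/?script[^>]*>/gi, "")
    // Log the decoded content instead of writing it to the page.
    .replace(/\bdocument\.write\s*\(/g, "console.log(")
    .trim();
}

// Made-up stand-in for the real attachment.
const demoHtml = '<script>let decodedHtml = "hi"; document.write(decodedHtml)</script>';
console.log(prepareForConsole(demoHtml));
```

A plain text substitution like this will miss aliased calls (for example, a variable that points to document), so the result should still be reviewed by hand before it is pasted into the console.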
Replacing document.write with console.log yields:
Great success! It appears we were able to get the payload. However, a next encoded block awaits in value u, and the script appears to call a file hosted on Cloudflare for further functionality.
As before, let’s go to the end of the script and we can observe the following function:
The same decoding function, beautified:
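// d: base64 encoded ciphertext, h: AES key as plain text
function p(d, h) {
    const b = CryptoJS.enc.Base64.parse(d);
    const m = CryptoJS.enc.Utf8.parse(h);
    // AES in ECB mode with PKCS7 padding; no IV is involved
    const v = CryptoJS.AES.decrypt({ ciphertext: b }, m, {
        mode: CryptoJS.mode.ECB,
        padding: CryptoJS.pad.Pkcs7
    });
    return v.toString(CryptoJS.enc.Utf8);
}
// Decrypt the block u with key a and replace the current page with the result
function q() {
    const q = document;
    const x = p(u, a);
    q.open();
    q.write(x);
    q.close();
}
window.onload = q;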
This script will define two constants with specific values, where:
u: the encoded block, a large blob of base64 encoded and AES encrypted data
a: the hardcoded (static) AES key in plain text
There are also constants as seen in the code above and a function p to interact with the CryptoJS library loaded from Cloudflare, which can be used to handle specific cryptographic operations. In this case, it is responsible for decrypting the AES-encrypted block after base64 decoding it. Since this implementation is AES in ECB mode (with PKCS7 padding), there is only a hardcoded key and no IV is needed. The key is 8527412153049366.
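For readers who prefer to decrypt outside the browser, the same operation can be sketched with Node.js’s built-in crypto module instead of CryptoJS. This is a swapped-in approach, not what the sample uses, and it assumes the 16-character key is used directly as raw UTF-8 bytes (AES-128). The names decryptStage and encryptForDemo are hypothetical, and the ciphertext below is a placeholder generated on the spot rather than the real blob.

```javascript
const crypto = require("crypto");

// Decrypt a base64 blob with AES-ECB and PKCS7 padding, using the key text as raw UTF-8 bytes.
function decryptStage(base64Ciphertext, keyText) {
  const key = Buffer.from(keyText, "utf8"); // 16 characters -> AES-128
  // ECB mode takes no IV, so null is passed.
  const decipher = crypto.createDecipheriv(`aes-${key.length * 8}-ecb`, key, null);
  const plain = Buffer.concat([
    decipher.update(Buffer.from(base64Ciphertext, "base64")),
    decipher.final(), // removes the PKCS7 padding (Node's default behaviour)
  ]);
  return plain.toString("utf8");
}

// Demo-only helper: encrypt placeholder text so the example has something to decrypt.
function encryptForDemo(plainText, keyText) {
  const key = Buffer.from(keyText, "utf8");
  const cipher = crypto.createCipheriv(`aes-${key.length * 8}-ecb`, key, null);
  return Buffer.concat([cipher.update(plainText, "utf8"), cipher.final()]).toString("base64");
}

const a = "8527412153049366"; // key reported in this post
const u = encryptForDemo("<html>placeholder next stage</html>", a); // stand-in for the real blob
console.log(decryptStage(u, a));
```

In a real analysis, u would be the base64 blob copied from the script, and the output should be treated as untrusted HTML and only viewed as text.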
Finally, function q is responsible for writing and loading (opening) the decoded and decrypted content.
In the next section, let’s decode this script again like we did before (and perhaps, we’ll get to the final payload).
Stage 4 – The final payload?
Similarly to how we analysed the previous script, we want to log the output in the console, but there is an issue: calling certain scripting functionality is not possible via the Debugging tools in browsers and furthermore, there are some window.onload events we may not want to execute. So what do we do?
Let’s use the same trick the attackers used by creating our own “HTML smuggling” to safely decode, decrypt and log the output.
First, declare the file as HTML and load the CryptoJS library. We have opted to download the same library from Cloudflare and store it offline, to ensure nothing goes out from our device to the internet. Note you can reference the JS library from the online resource as well; however, working offline is more secure. You can place the newly created HTML smuggle and the CryptoJS library in the same directory for convenience (in our case in c:\demo).
Next, we can simply copy over the functionality from the malicious JavaScript, the base64 encoded content…:
And the AES key as well as the decoding & decryption functionality:
Finally, we want to log this to the browser’s console and execute the newly added logging function:
Don’t forget to include both the script and HTML closing tags as shown in Figure 10 above and save the file as decrypt_me.html for example.
Now open the file and the result should be as shown in Figure 11 below:
From Figure 11, we can already observe further JavaScript functionality (e.g. document.getElementById, atob, …) as well as what appear to be pseudo-randomly named variables such as ficus, penguin, nutcracker and so on.
In other words, more HTML smuggling! The final HTML file, again with included JavaScript content, is 131 lines long.
Going through the HTML file, we can observe several functionalities (i.e. to not have it indexed by search engines, event listeners, …) and as before, there is some obfuscation going on. That said, the only obfuscation leveraged pertains to base64 encoded values and data. An example is given in Figure 13 below:
The variables ficus and yam contain base64 encoded data that, when decoded via the atob function (used legitimately in/by JavaScript to decode base64), will reveal https://login.microsoftonline.com/ and authorisation parameters such as common/oauth2/authorize/client_id=00000002-0000-0ff1-ce00-000000000000.
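To decode such values in bulk without a browser, something like the sketch below can be used in Node.js, where Buffer plays the role of atob. The helper names are hypothetical, and the encoded values are generated here for the demo rather than copied from the sample; which variable holds which string is also an assumption.

```javascript
// Node.js counterpart of the browser's atob for text values.
function decodeBase64(value) {
  return Buffer.from(value, "base64").toString("utf8");
}

// Decode every string property of an object and return the results by name.
function decodeAll(variables) {
  const decoded = {};
  for (const [name, value] of Object.entries(variables)) {
    decoded[name] = decodeBase64(value);
  }
  return decoded;
}

// Stand-in values: encoded here for the demo, not taken from the sample.
const encode = (text) => Buffer.from(text, "utf8").toString("base64");
const variables = {
  ficus: encode("https://login.microsoftonline.com/"),
  yam: encode("common/oauth2/authorize"),
};
console.log(decodeAll(variables));
```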
Scrolling to the end of the HTML file will give us the following functionality:
A few variables are initialised and the HTML / JavaScript will attempt to load an iframe. An iframe can be used for several (malicious) purposes, however, in this case, it’s used to load a malicious URL – which will be executed in tandem with the Microsoft Office 365 login page.
You may have also observed an additional variable named jojoba. This variable is defined earlier in the HTML code, contains the base64 encoded email address of the target and is used as location hash and “identifier”. In other words, this user information will get sent to the attacker and is already an indication the target opened the HTML attachment.
We can decode this iframe safely as we have done before via the console, however, you must remove the insertAdjacentHTML part as it will dynamically add the iframe content to the website displayed (invisibly or visibly) and therefore load it.
Finally, running the modified HTML will result in:
The iframe will try to load (base64 value containing the email address removed):
https[:]//href[.]li/?https[:]//9zg[.]aforenotedc[.]ru/oBVboBDZoE8nn9Lnp8eBs/
The iframe will redirect, via href.li, to the malicious domain, 9zg[.]aforenotedc[.]ru. This domain serves the URI seen above and will display the legitimate Microsoft Office login page to the user, verify that the credentials are valid, and send the login data to this malicious domain as well.
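When triaging such redirect chains, it can help to split the wrapped URL into its parts and defang the result before sharing it. The sketch below assumes the destination follows the question mark as-is and that the per-target identifier travels as a URL fragment; unwrapRedirect, defang and the phish.example URL are hypothetical.

```javascript
// Split an href.li-style redirect into the redirector, the wrapped destination and the fragment.
function unwrapRedirect(wrapped) {
  const url = new URL(wrapped);
  return {
    redirector: url.hostname,
    target: url.search.slice(1), // everything after '?' is the destination
    fragment: url.hash.slice(1), // per-target identifier, if present
  };
}

// Defang a URL for safe sharing, in the style used in this post.
function defang(url) {
  return url.replace(/:\/\//g, "[:]//").replace(/\./g, "[.]");
}

// Made-up URL; "dGVzdA==" is base64 for "test".
const demo = unwrapRedirect("https://href.li/?https://phish.example/abc/#dGVzdA==");
console.log(demo.redirector, defang(demo.target), demo.fragment);
```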
The targeted user will have said login screen with the email address already filled in, or may receive a login prompt as follows:
Of note, the phishing campaign does employ a few minor tricks to attempt to evade detection systems and analysts. For example, the iframe was protected with Cloudflare at the time of analysis:
The target will receive the same message when opening the HTML file and will have to go through the “human” verification before the Microsoft Office login page is displayed.
An additional verification will also happen to ensure the actual target has indeed opened the HTML: if not, a redirect to another, innocuous, website will happen (e.g. Yahoo, Bestbuy, …).
This brings us to the end of the analysis of the multiple stages of this campaign.
Campaign extends wider
We have identified 4 other samples, highly likely related to the same campaign. Interestingly, 2 of them contain only 1 obfuscation layer after executing the HTML. For example, when we decode the layer we can retrieve the HTML and JavaScript code as shown in Figures 19 and 20 below:
The decoded content in sample 2 as shown in Figure 20 above, contains the targeted user in clear text.
The 3rd sample has the same “double obfuscation” using base64 and AES (decryption via the Cloudflare-hosted CryptoJS library, with AES key 6534135480761922) and the output seems more complex, but it is not:
The “getScript” will simply decode the hex encoded values, leading to another phishing landing page with the target’s data already filled in. Oddly enough, the “t” variable is never called and as such, it might be an artefact forgotten or perhaps left over from testing.
The 4th sample follows the initial sample completely throughout the stages or execution chain, up to and including the loading of the iframe. A different AES key is used (6910525483127436) and the target is different; otherwise, it is an exact match.
What all campaigns have in common is the stage 1 HTML smuggling and specifically, this part of the code as also outlined in Stage 2 in the beginning of this blog post.
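// arr is the array of mixed binary/hex strings initialised at the top of the script
let decodedHtml = "";
for (let mixed of arr) {
    // A leading '1' marks a binary value; anything else is treated as hex
    let isHex = mixed[0] != '1';
    if (isHex) {
        decodedHtml += String.fromCharCode(parseInt(mixed, 16));
    } else {
        decodedHtml += String.fromCharCode(parseInt(mixed, 2));
    }
}
// Render the decoded next stage in the current page
document.write(decodedHtml)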
This leads us to believe that these smuggling files have been generated automatically, where the attacker simply adds the target email address as a parameter (and potentially a new AES key per campaign). Many HTML smuggling projects exist on code sharing platforms such as GitHub; however, the attacker might also have developed their own private / internal HTML smuggling “generator”.
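Since the stage 1 pattern is shared across samples, it lends itself to a simple heuristic. The following is only a rough sketch of one possible check, not a rule we run in production: the function name, the token length limit and the threshold of 20 are arbitrary assumptions, and a real rule would need tuning against benign files.

```javascript
// Heuristic: does the text contain many quoted tokens that mix binary and hex values?
function looksLikeMixedArray(scriptText, minTokens = 20) {
  // Collect short quoted tokens made up of hex digits only.
  const tokens = [...scriptText.matchAll(/["']([0-9a-fA-F]{1,8})["']/g)].map((m) => m[1]);
  // Binary candidates start with 1 and contain only 0 and 1.
  const binary = tokens.filter((t) => /^1[01]*$/.test(t)).length;
  // Hex candidates start with any hex digit other than 1.
  const hex = tokens.filter((t) => /^[02-9a-fA-F]/.test(t)).length;
  return binary + hex >= minTokens && binary > 0 && hex > 0;
}

// Made-up inputs: one imitating the stage 1 array, one benign.
const demoScript = "let arr = [" + Array(10).fill('"3c","1100011"').join(",") + "];";
console.log(looksLikeMixedArray(demoScript));              // true
console.log(looksLikeMixedArray('let greeting = "hello";')); // false
```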
Conclusion
While HTML smuggling is perhaps not a novel approach, it does appear to be gaining in popularity. For phishing campaigns specifically, it has shown to be a simple yet effective solution, as payload delivery is often only detected at the stage of an actual payload, such as an executable file.
Therefore, using HTML smuggling purely for spear phishing, layering the HTML with several encodings and obfuscations, may result in highly successful campaigns from the attacker’s end.
We recommend ensuring that all your defence mechanisms are able to handle advanced smuggling attempts such as the one described in this blog post. To give some additional pointers:
For analysing HTML smuggling and/or malicious JavaScript, several more robust tools such as SpiderMonkey are available with additional debugging functionality. That said, performing the analysis manually via the browser’s Console works just as well.
Indicators are shared below.
Indicators
Indicator | Type | Comment |
287691ade84c692b9ea3af2bee22096d13584c817fcb7c908c3c4c17c582aa5f | SHA256 Hash | Initial campaign HTML smuggled file |
7c3769acab50337d09e80762b9c20329b117d94243878a2a2eb91fba4a211f23 | SHA256 Hash | Other campaign HTML smuggled file |
b068bef330741844293f2c8f8c68b4c6786cee208a164daf4160f48bb196fa4b | SHA256 Hash | Other campaign HTML smuggled file |
27987136e29b3032ad40982c8b7c2e168112c9601e08da806119dcba615524b5 | SHA256 Hash | Other campaign HTML smuggled file |
0be8f379292feb4f8dbe1465bb0e81357ca1f3e7bec795d7798685c547db9f9e | SHA256 Hash | Other campaign HTML smuggled file |
https[:]//9zg[.]aforenotedc[.]ru/oBVboBDZoE8nn9Lnp8eBs/ | URI | Initial campaign iframe URI / phishing URI |
9zg[.]aforenotedc[.]ru | Domain | Initial campaign domain |
aforenotedc[.]ru | Domain | Initial campaign related domain |
https[:]//www[.]bandptrade[.]com/o/ | URI | Other campaign phishing URI |
https[:]//mallmentum[.]com/arull[.]php | URI | Other campaign phishing URI |
https[:]//2EXINIEH2TVN7QC66D58083D8A89[.]crudeschem[.]com/_backpoint?_webpack | URI | Other campaign phishing URI |
Bart is a senior manager at NVISO where he mainly focuses on Threat Intelligence, Incident Response and Malware Analysis. As an experienced consumer, curator and creator of Threat Intelligence, Bart loves writing TI reports and has written many on multiple levels, such as strategic and operational, across a wide variety of sectors and geographies. Twitter: @bartblaze