Sidekick 2.0 includes a set of powerful features that can help you accomplish a variety of tasks. Today, we will be applying several of them to the task of de-obfuscating strings in a malware sample called Amadey.
Amadey, as explained in its Malpedia entry, is a botnet that periodically sends information about the system and installed AV software to its C2 server and polls to receive orders from it. Its main functionality is that it can load other payloads (called “tasks”) for all or specifically targeted computers compromised by the malware. This particular malware sample employs an obfuscation technique that stores the strings referenced by the binary as encrypted strings that are then decrypted during runtime. This makes it more difficult for analysts to reverse engineer and understand what the malware is doing and also prevents anti-virus software from identifying it.
Big thanks to Josh Reynolds from InvokeRE for giving us the Amadey sample and working with our team to improve Sidekick’s malware analysis while we were working on this post.
Let’s get started.
First, we need to find the function in the binary that decrypts the encrypted strings. (Note: This sample binary is stripped of any symbol information that would ordinarily assist us in this task.) Since this task requires both iterating over the code in the binary and analyzing its contents, the automation and reasoning capabilities of the Analysis Workbench make it the most suitable tool for the job.
Within the Analysis Workbench, we enter a simple task description of “Find functions that perform decryption on strings”:
Sidekick automatically generates the script that will perform that task. You will notice that this script uses an LLMOperator to determine if a given function performs decryption on strings (line 2). When invoked, the LLMOperator will use a large language model to determine if the function passed to it (on line 12) performs decryption on strings.
Since the “Description” metadata in the index entry doesn’t provide us with any additional information, let’s ask the Sidekick Coding Assistant to give us an actual description of the decryption routine:
Next, let’s run the script and see what we get in the output Decryption Functions index:
Within a few seconds, we already have a few entries to look at. The function sub_401290 seems promising, so let’s take a peek:
Yep. This is the decryption function we’re looking for.
Let’s see if we can convert it to Python so that we can easily run it within an Analysis Workbench script. The Sidekick Assistant is particularly suitable for this kind of task given its focus on smaller sets of functions, so let’s ask the Assistant to convert it to Python for us:
That was easy!
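To give a rough idea of what a converted decryptor looks like once it lives in Python, here is a minimal stand-in. It uses repeating-key XOR purely for illustration; this is an assumption, not Amadey's actual algorithm, and the function name `decrypt_string` and the key are hypothetical.

```python
from itertools import cycle


def decrypt_string(data: bytes, key: bytes) -> str:
    """Hypothetical stand-in decryptor: repeating-key XOR.

    This is NOT the routine recovered from sub_401290; it only shows the
    shape of a small, pure-Python decryption function that a script can call.
    """
    if not key:
        raise ValueError("key must not be empty")
    # XOR every ciphertext byte with the key, repeating the key as needed.
    plain = bytes(b ^ k for b, k in zip(data, cycle(key)))
    # latin-1 maps every byte value to a character, so decoding never fails.
    return plain.decode("latin-1")


# Round trip with a made-up key: XOR is its own inverse.
example_key = b"ab"
ciphertext = bytes(b ^ k for b, k in zip(b"hello", cycle(example_key)))
print(decrypt_string(ciphertext, example_key))
```

Keeping the decryptor as a pure function of bytes, with no dependency on the binary view, makes it easy to paste into another script and test on its own.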
Now that we have identified the decryption function and converted it to Python, let’s actually use it to decrypt all the strings passed to it. Since we are searching for multiple locations across the binary, calling custom Python code on data at those locations, and outputting those results for us to review, the Analysis Workbench is the perfect tool for this job. So let’s write another script.
In the Analysis Workbench, we will create a new, empty script by clicking on the hamburger menu and selecting New Script.
We’ll give our script a title, paste in our Python decryption function, give it a new name, and ask the Sidekick Coding Assistant to use the given Python decryption function to decrypt all the strings passed to the sub_401290 function.
Let’s see what we get.
After running this revision of the script, we do not get any results. This happens because the strings passed to sub_401290 do not have string variables defined for them in the binary, so bv.get_string_at will not return anything. Binary Ninja did not create string variables for them because they are encrypted and do not look like real strings, which is common when reverse engineering tools analyze binaries with encrypted strings. Therefore, we need to create each string by reading its bytes from the binary directly. Instead of trying to do that ourselves, let’s just ask Sidekick to create a function for us that does that. At the same time, let’s also have Sidekick update the script to output both the call instruction and the decrypted string to the index as the strings are decrypted.
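For reference, a byte-reading helper of the kind described above might look roughly like the sketch below. It assumes the encrypted data is NUL-terminated, which may not hold for every sample. The helper name `read_c_string`, the `max_len` cap, and the fake memory used in the demo are all hypothetical. The helper takes any `read(addr, length)` callable, so inside Binary Ninja it could be given `bv.read`.

```python
from typing import Callable


def read_c_string(read: Callable[[int, int], bytes], addr: int,
                  max_len: int = 256) -> bytes:
    """Read raw bytes from addr up to, but not including, the first NUL byte.

    `read` is any callable with the shape read(addr, length) -> bytes.
    `max_len` is an assumed safety cap so a missing terminator cannot
    make the loop run away.
    """
    out = bytearray()
    for offset in range(max_len):
        chunk = read(addr + offset, 1)
        # Stop at unreadable memory or at the NUL terminator.
        if not chunk or chunk[0] == 0:
            break
        out.append(chunk[0])
    return bytes(out)


# Demo against a fake memory image standing in for the binary.
BASE = 0x1000
MEMORY = b"\x13\x0e\x07\x07\x04\x00rest"


def fake_read(addr: int, length: int) -> bytes:
    start = addr - BASE
    if start < 0:
        return b""
    return MEMORY[start:start + length]


encrypted = read_c_string(fake_read, BASE)
print(encrypted.hex())
```

Reading one byte at a time is slow but simple; it also stops cleanly at the end of a segment, where a larger read might come back short.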
After running this script revision, we get this error:
Errors are no fun, so let’s have Sidekick deal with it (since it has access to the content in the Output console):
Now let’s run it!
Success!
Since we output the decrypted strings to an index, we can use Sidekick’s Code Insight Map to view the relationship between functions that reference those decrypted strings and help us understand what the binary is doing with those strings. Let’s also run the High-Level Functions script (included by default within the Analysis Workbench) to generate an index of high-level functions in this binary.
Let’s scroll down to see a little bit more:
Now that we’ve decrypted the strings referenced in this binary, we can very quickly get an idea of some of what this malware does: it is a botnet that periodically sends information about the system and installed AV software to its C2 server and polls to receive orders from it.
Using Sidekick 2.0, we were able to de-obfuscate a malware binary and gain a much clearer picture of what it does within minutes.
If you are not already using Sidekick, then sign up today to start making your malware analysis easy.