With a slew of decompilation improvements, Binary Ninja 3.5 (Coruscant) has completed its jump from our hyperspace dev channel, bringing even more improvements to decompilation quality and many other quality-of-life improvements across the UI, API, documentation, debugger, and more! Here's a list of the biggest changes, but don't forget to check out the full list of changes with even more fixes and features.
One of the many things compilers do that can make reverse engineering harder is apply a variety of algorithmic optimizations, in particular for modulus and division calculations. Instead of implementing them with native CPU instructions, compilers use shifts and multiplications by magic constants that, when operating on a fixed integer size, have the same effect as a native division instruction.
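To make the trick concrete, here is a minimal Python sketch of the idea for unsigned 16-bit division by 5. The constants are chosen for illustration and are not taken from any particular compiler's output:

```python
# Sketch of the "magic constant" trick compilers use in place of a
# divide instruction: for unsigned 16-bit division by 5, pick
# M = ceil(2^18 / 5) so that x / 5 becomes a multiply and a shift.
M = (2**18 + 4) // 5          # 0xCCCD

def div5(x):
    # A 16x16 -> 32 bit multiply, keeping only the high bits.
    return (x * M) >> 18

# The shortcut agrees with true division for every 16-bit input.
assert all(div5(x) == x // 5 for x in range(2**16))
print("ok")
```

Because 5 * M = 2^18 + 1, the error term stays small enough that the shortcut is exact over the full 16-bit range; for other divisors (such as 7) the magic constant does not fit as neatly and compilers emit extra fixup adds and shifts, as seen later in this post.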
There are several ways to try to recover the original division, which is far more intuitive and easier to reason about. One approach, demonstrated by the Division Deoptimization plugin, is to use a solver such as z3 to recover the simplification. Unfortunately, at scale and across compilers and architectures this implementation is not only slow, but it actually fails at least as often as simpler heuristics.
Finally, there are several heuristics based on detecting particular patterns (asm or IL based). This is the route we've chosen for our division/modulus deoptimization feature. Like most heuristic approaches, this is not a foolproof mechanism! But no other tool is perfectly accurate either. For example, all tested tools fail to consistently reconstruct 16-bit division with small immediate divisors when compiled for x64 using clang:
IDA:
__int64 __fastcall i16_div_by_103(int a1)
{
return ((unsigned int)(20361 * a1) >> 31) + ((20361 * a1) >> 21);
}
Ghidra:
int _i16_div_by_103(int param_1)
{
return (param_1 * 0x4f89 >> 0x15) - (param_1 * 0x4f89 >> 0x1f);
}
Binary Ninja:
uint64_t _i16_div_by_103(int32_t arg1)
int32_t rax = arg1 * 0x4f89
return zx.q((rax s>> 0x15) + (rax u>> 0x1f))
All existing tools exhibit some gaps in their coverage, and there are additional concerns, such as how to handle division and modulus calculations where the operand is a power of two. Those can often be expressed as bit shift or mask operations, and are indistinguishable from intentional shift/mask operations that are functionally identical.
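For instance, an unsigned division by 8 and a right shift by 3 produce identical machine code, so the decompiler has no way to tell which form the programmer originally wrote. A quick Python check of the equivalence:

```python
# For unsigned (non-negative) values, division and modulus by a
# power of two reduce to a shift and a mask, so the compiled code
# carries no hint of which form the source used.
assert all((x >> 3) == x // 8 for x in range(2**16))   # x / 8  ->  x >> 3
assert all((x & 7) == x % 8 for x in range(2**16))     # x % 8  ->  x & 7
print("indistinguishable")
```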
A wrong implementation can also produce incorrect results. Consider the case of division by 7, for example. An overly simplistic heuristic can incorrectly detect an optimized division in code that actually diverges from true division at extreme values due to fractional rounding errors, and conversely can misidentify the actual optimized implementation.
Given the following two similar implementations:
>>> def div_by_seven(input):
... x8 = (input * 0x2492492492492493) >> 0x40
... x9 = input - x8
... x8_1 = x8 + (x9 >> 1)
... return x8_1 >> 2
>>> def not_div_by_seven(input):
... return (input * 0x2492492492492493) >> 0x40
...
>>> hex(not_div_by_seven(2**64-10))
'0x2492492492492491'
>>> hex(div_by_seven(2**64-10))
'0x2492492492492490'
Ghidra's decompilation:
ulong _u64_div_7(ulong param_1)
{
return param_1 / 7 + (param_1 - param_1 / 7 >> 1) >> 2;
}
ulong _u64_not_really_div_7(ulong param_1)
{
return param_1 / 7;
}
Binary Ninja’s decompilation:
uint64_t _u64_div_7(int64_t arg1) __pure
{
return (arg1 / 7);
}
uint64_t _u64_not_really_div_7(int64_t arg1) __pure
{
return ((int64_t)((arg1 * 0x2492492492492493) >> 0x40));
}
That said, Binary Ninja's implementation is by no means complete. In particular, armv7 support is currently lacking due to a lifting issue (which we will resolve on dev shortly after the stable release), while armv8, x86, and x64 average about a 55% success rate across all tested integer sizes and compilers in our exhaustive deoptimization matrix. If you're curious, we tested all integer sizes of 8, 16, 32, and 64 bits, both signed and unsigned, for both div and mod, on all four of the above architectures, using clang, gcc, and msvc.
If you’re working on armv7 and this feature is important, consider switching to our development release channel and you should get the deoptimization improvements shortly after the stable release!
One easy way to improve decompilation output is to come up with better default names for variables. There are a lot of possible defaults you could choose, and a number of different strategies are seen throughout different reverse engineering tools. Prior to 3.5, Binary Ninja named variables based on their origin: stack variables were var_OFFSET, register-based variables were reg_COUNTER, and global data variables were data_. While this scheme isn't changing, we're being much more intelligent about situations where additional information is available.
For example, if a variable is passed to a function and a variable name is available, we can now make a much better guess for the variable name. This is most obvious in binaries with type libraries. Here’s a sample EXE file showing this new default naming:
This isn't the only style of default names. Binary Ninja will also name loop counters with simpler names like i, j, k, etc. (in the case of nested loops):
3.5 contains brand new UEFI support. Not only has a new EFI platform been added, with support for some built-in platform types and automatic recognition of UEFI firmware files, but we've also released an official EFI Resolver plugin that resolves many protocol GUIDs to add additional symbols and type information.
As we often do for supplemental analysis in plugins, we’ve released the EFI Resolver plugin under a permissive license and PRs or issues are welcome!
Another straightforward but valuable improvement to analysis has come in the form of Heuristic Pointer Analysis. Prior to 3.5, pointers were created any time a use of a pointer was observed in code, but there are a number of situations (such as virtual function pointers) where a heuristic that identifies and creates pointers can improve future analysis (such as RTTI recovery).
The current implementation is enabled by default behind the analysis.pointerSweep.autorun setting, and can also be manually triggered via the Analysis / Run Pointer Sweep menu, or the Run Pointer Sweep action in the command palette.
Conditions for detecting pointers include:
The issue associated with this improvement is titled "improved stack structure aliasing", but the end impact for most users will be more accurate dead code elimination. Why is that? First, let's talk about stack variable aliasing. Aliasing occurs any time memory can potentially be accessed through different addressing mechanisms. Because a struct base pointer can be used to access the members at fixed offsets, stack structures are often modified in ways that are hard for program analysis to track without careful alias analysis. Thus it is often better to make conservative assumptions about potentially aliased stack locations, even at the expense of missing decompilation optimization opportunities.
Prior to 3.5, Binary Ninja was fairly aggressive with assuming constant values on the stack. If it saw only a single reference writing a value, and no other writes, it would assume that constant value during later analysis. However, if the value was part of a structure and was changed by a different reference (in this case, the base pointer plus an offset), this would mean that the decompiler’s assumptions about the value being constant were incorrect.
This issue, combined with dead code elimination (where conditionals are removed when the value they test is known to be constant), meant that Binary Ninja was removing live code that shouldn't be removed. The newly improved analysis better accounts for variable aliasing, making Binary Ninja less aggressive about incorrectly assuming values are constant.
Of course, the downside to this change is that there are times when you DO want the values to be treated as constant for dataflow/analysis purposes. Thankfully, this is easy to do: you can either mark the variable as const or use UIDF to set the value so that constant data flow propagation can continue, potentially enabling other decompilation optimizations.
Prior to #3903 being resolved, the first time you loaded a binary in Binary Ninja, you could experience up to a few seconds of delay as all type libraries were de-serialized, even if they weren't relevant for the file type you were opening. As of this release, that is no longer the case: type libraries are now de-serialized on demand, resulting in much faster load times!
Users most likely to be impacted by this performance improvement will be those batch-processing files where they are also using one binary per executable (which is actually recommended behavior for other reasons) though of course all users will see a very small improvement in initial load times.
This was additionally noticeable in situations like dogbolt.org, where the slower de-serialization was disproportionately slowing down our decompilation compared to other tools. By the time you're reading this blog, the new version should be live. (There's actually a sneaky "refresh" icon, disabled by default, that you can enable via the web developer console to reload decompilation with the latest version if you're looking at results from an older version of one of the tools.)
TTD (Time-Travel Debugging) is a technology that records an execution trace of a program or system and then replays it during debugging. We have integrated WinDbg's TTD capability into our debugger, which allows you to replay traces recorded by WinDbg within our debugger user interface. This capability is currently only available when debugging Windows binaries, though future support for other backends capable of TTD is under consideration.
TTD brings unparalleled advantages over traditional debugging. It can save you a lot of time, since you no longer need to restart the target repeatedly. It also makes it easier to stop the target at a certain point. Say a loop is executed 100 times, and you want to break at the last iteration. This is not easy to do in traditional debugging, but it is trivial in TTD: just place a breakpoint right after the exit of the loop and reverse-execute a few instructions. TTD is also helpful for malware analysts since, for example, a C&C server could reject requests if someone tries to download the payload multiple times.
To get started, follow the setup instructions and explore the potential of TTD!
You might remember that MH_FILESET support was in the experimental section of the 3.4 release, and we're pleased to announce it has graduated and is enabled by default! Any time you open a MachO file containing MH_FILESET records, you won't have to do anything to take advantage of this new feature.
Experimental features are those that ship in the product but are either disabled by default or have known limitations that prevent us from advertising them as fully ready. We encourage testing, and please let us know if you find any issues with these features via GitHub, Slack, or elsewhere.
Despite some improvements and fixes to the components UI, we're leaving it in the experimental category for one more stable release. However, we expect to enable it by default soon after the 3.5 release, in time for the upcoming 4.0 release.
Our previous official DWARF Import plugin has been deprecated in favor of a new built-in plugin. However, it is currently still experimental and is disabled by default. Additional testing is welcome, simply go to settings and search for “dwarf” to enable the Import Plugin:
Once enabled, ELF and MachO files with embedded DWARF will automatically be parsed when the file is loaded.
The DWARF export plugin is likewise still considered experimental. However, since it is only triggered manually via the "Export as DWARF" action or the Plugins / Export as DWARF menu item, it ships enabled by default.
Exporting as DWARF allows you to create an external file with data such as symbols and types that could be associated with the primary executable in another tool that can’t parse Binary Ninja’s database format. Currently exported features include:
During our standard release process testing, we identified a memory leak introduced during the last development cycle. As we tested that fix, we realized there were a number of other latent leaks that had evaded previous testing, and we spent significant time hunting and eliminating them these past few weeks. Many use cases won't be affected if you tend to reverse engineer one binary at a time and restart between sessions. However, if you were previously opening and closing many binaries in the same session, it's possible you'll see significant memory reduction as a result of these improvements! Examples of previous leaks include leftover BinaryView references retained as a result of certain rebasing operations, certain debugger operations, and more.
Special thanks to (in no particular order): jprokos26, mmaekr, tanakalian, meithecatte, JJTech0130, toolCHAINZ, emesare, alexrp, unknowntrojan, mkrasnitski, amtal, SmoothHacker, yrp604 and ek0 for your contributions to our open source components during this release cycle!
One common problem with scrolling through large flow graphs is that, when zooming out to see the structure of the graph, it becomes difficult to see the control flow. While looking at the nodes from this perspective is helpful for mentally organizing the graph into chunks, early returns can introduce invisible control flow since you cannot easily see which blocks have outgoing arrows. Now, entry and exit nodes are marked with an indicator visible at any zoom level, showing where control flow happens both into and out of the function. Additionally, blocks that cannot return, such as calls to abort(), have a distinct color from normal returns. These indicators should help with scanning larger control flow graphs during analysis, and hopefully improve your speed and accuracy when gaining a first-pass overview of a function.
A common feature in many text editors, rainbow braces are now supported in Binary Ninja as well. This should help with scanning for matching sets of parentheses in lines with complex conditions, as well as marking indentation levels for deeper-nested blocks.
Along with rainbow braces, you can now highlight a brace (or bracket/parenthesis) and see its match. This should improve your ability to analyze complex expressions by showing you the grouping of operations and instructions.
Other minor features and fixes in this release include:

- New setting (ui.view.common.commentWidth) for line-wrapping comments
- N hotkey can be used to name an enumeration member at a use
- Restart with Plugins Disabled action created for easier troubleshooting
- Add Segment dialog has proper buttons and is no longer resizable
- show_html_report
- __pure
- analysis.limits.maxFunctionSize is consistent, and 0 causes it to be ignored, not for analysis to skip all functions (if previously using this to skip analysis, use the analysis.initialAnalysisHold setting)
- ReadPointer API added to BinaryReader
- AnalysisContext API added (thanks JJTech!)
- BinaryView.undoable_transaction()
- open_view / load error messages are much more helpful and the APIs themselves are now more consistent with opening in the UI
- parse_expression
- load API works correctly with raw bytes
- __ptr_offset annotation
- .bndb file (it will still show by default if you are uploading a new binary)
- callees documentation corrected
- ; characters
- 0x could be omitted

…and, of course, all of the usual "many miscellaneous crashes and fixes" not explicitly listed here. Check them out in our full milestone list: https://github.com/Vector35/binaryninja-api/milestone/13?closed=1