Starting with JEB 4.2, users have the ability to instruct dexdec to load external Intermediate Representation (IR) optimizer plugins.
From a very high-level perspective, a Dex method scheduled for decompilation goes through the following processing pipeline:
Phase 3 consists of repeatedly calling IR processors, which take an input IR and transform it into another, further refined IR (a process called “lifting”). IR processors range from junk code cleaners, variable propagation, immediate propagation, constant folding, higher-level construct rebuilding, compound predicate rebuilding, and code restructuring, to all sorts of obfuscation removal and advanced optimizers that may involve emulation, dynamic or symbolic execution, etc.
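The repeated application of IR processors can be pictured as a fixed-point loop: passes run in turn until none of them reports a change. The following is a minimal, plain-Python sketch of that idea only. It does not use the JEB API; the tuple-based IR, the two toy passes (fold_constants, simplify_identities) and run_pipeline are hypothetical names invented for illustration, and the way dexdec actually schedules its optimizers may well differ.

```python
# Toy IR (an assumption for illustration, not dexdec's IR):
#   int            -> immediate
#   str            -> variable
#   (op, lhs, rhs) -> binary operation, op is 'add' or 'mul'

def fold_constants(node):
    """Fold operations whose two operands are immediates. Returns (node, changes)."""
    if not isinstance(node, tuple):
        return node, 0
    op, lhs, rhs = node
    lhs, n1 = fold_constants(lhs)
    rhs, n2 = fold_constants(rhs)
    if isinstance(lhs, int) and isinstance(rhs, int):
        value = lhs + rhs if op == 'add' else lhs * rhs
        return value, n1 + n2 + 1
    return (op, lhs, rhs), n1 + n2

def simplify_identities(node):
    """Rewrite x + 0 -> x and x * 1 -> x. Returns (node, changes)."""
    if not isinstance(node, tuple):
        return node, 0
    op, lhs, rhs = node
    lhs, n1 = simplify_identities(lhs)
    rhs, n2 = simplify_identities(rhs)
    if (op == 'add' and rhs == 0) or (op == 'mul' and rhs == 1):
        return lhs, n1 + n2 + 1
    return (op, lhs, rhs), n1 + n2

def run_pipeline(ir, passes):
    """Apply every pass in order, round after round, until a full round changes nothing."""
    rounds = 0
    while True:
        rounds += 1
        total = 0
        for optimizer in passes:
            ir, changes = optimizer(ir)
            total += changes
        if total == 0:
            return ir, rounds

# (2 + 3) * (x + 0)
ir = ('mul', ('add', 2, 3), ('add', 'x', 0))
result, rounds = run_pipeline(ir, [fold_constants, simplify_identities])
print(result, rounds)
```

Each toy pass returns the number of changes it made, which is what lets the driver decide when to stop.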
By working at this level, power users can write custom deobfuscators that we may not be able to deliver as JEB built-ins for a variety of reasons (e.g. obfuscation specific to a single group of files, custom protection of files under NDA, etc.).
dexdec IR plugins are JEB back-end plugins (not front-end scripts). Therefore, they are to be dropped in the coreplugins folder (or coreplugins/scripts for plugin scripts). They can be written as:
In this blog, we will show how to write a Python plugin script. Users familiar with JEB client scripting will be in familiar territory.
IMPORTANT! Note that loading such plugins is not enabled by default in JEB. Add the following line to your bin/jeb-engines.cfg file to enable loading Python plugins:

    .LoadPythonPlugins = true
dexdec IR plugins must implement the IDOptimizer interface. In practice, it is highly recommended to extend the implementing class AbstractDOptimizer, like this:
    from com.pnfsoftware.jeb.core.units.code.android.ir import AbstractDOptimizer

    # sample IR plugin, does nothing but log the IR CFG
    class DOptSamplePython(AbstractDOptimizer):
        # perform() returns the number of optimizations performed
        def perform(self):
            self.logger.info('MARKER - Input IR-CFG: %s', self.cfg)
            return 0
IMPORTANT! All dexdec IR public interfaces and types are located in the com.pnfsoftware.jeb.core.units.code.android.ir package. Keep a tab open on this page while you develop IR plugins!
The skeleton above:
If you haven’t done so, start JEB. Your plugin should appear in the list of dexdec plugins. Check the Android menu, Decompiler Plugins handler:
Now load a dex/apk, and decompile any class. Your plugin will eventually be called. The logger view should attest to that by displaying multiple “MARKER - Input IR-CFG: …” lines.
dexdec's IR consists of IDElement objects. Every IR statement is an IDInstruction, itself an IDElement. (All those types and their attributes are described in depth in the API doc.) When an IR plugin is called, it “receives” an IDMethodContext (representing a decompiled method), stored in the optimizer’s ctx public field. The IR CFG, a control flow graph consisting of IR statements, can be retrieved via ctx.getCfg(). It is also stored in the cfg public field, for convenience. A formatted IR CFG may look like this:
    0000/2+ !onCreate(v4<com.pnfsoftware.raasta.AppHelp>, v5<android.os.Bundle>)<void>
    0002/2: !requestWindowFeature(v4<com.pnfsoftware.raasta.AppHelp>, 1)<boolean>
    0004/3: !setContentView(v4<com.pnfsoftware.raasta.AppHelp>, 7F030000)<void>
    0007/5: !x4<android.webkit.WebView> = ((android.webkit.WebView)findViewById(v4<com.pnfsoftware.raasta.AppHelp>, 7F070000)<android.view.View>)<android.webkit.WebView>
    000C/2: !loadData(x4<android.webkit.WebView>, getString(v4<com.pnfsoftware.raasta.AppHelp>, 7F05005B)<java.lang.String>, "text/html", "utf-8")<void>
    000E/3: !setBackgroundColor(x4<android.webkit.WebView>, 0)<void>
    0011/1: !setDefaultTextEncodingName(getSettings(x4<android.webkit.WebView>)<android.webkit.WebSettings>, "utf-8")<void>
    0012/1: return
Statements can have any of the following opcodes (see DOpcodeType): IR_NOP, IR_ASSIGN, IR_INVOKE, IR_JUMP, IR_JCOND, IR_SWITCH, IR_RETURN, IR_THROW, IR_STORE_EXCEPTION, IR_MONITOR_ENTER, IR_MONITOR_EXIT.
Statement operands are themselves IDElements, usually IDExpressions. Examples: IDImm (immediate values), IDVar (variables), IDOperation (arithmetic/bitwise/cast operations), IDInvokeInfo (method invocation details), IDArrayElt (representing array elements), IDField (representing static or instance fields), etc. Refer to the hierarchy of IDElement for a complete list.
IR statements can be seen as recursive IR expression trees. They can be easily explored (see the visitXxx methods) and manipulated. They can be replaced by newly-created elements (see the IDMethodContext.createXxx methods). Data-flow analysis can be performed on the IR CFG to retrieve use-def and def-use chains, and other variable liveness and reachability information (see cfg.doDataFlowAnalysis).
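To make the "recursive expression tree" idea concrete, here is a small plain-Python sketch of a pre-order walk that replaces a sub-expression in place. It is only a stand-in for the concept: Node, replace_sub_expression, visit_preorder and make_rule are hypothetical names written for this example, not the dexdec types or methods, and the real API doc remains the reference for actual signatures.

```python
class Node(object):
    """Minimal stand-in for an IR expression (NOT the JEB IDElement API)."""
    def __init__(self, kind, value=None, children=None):
        self.kind = kind                      # 'imm', 'call' or 'op'
        self.value = value                    # immediate value, method name or operator
        self.children = list(children or [])

    def replace_sub_expression(self, old, new):
        """Swap a direct child for another node; return True if 'old' was found."""
        for i, child in enumerate(self.children):
            if child is old:
                self.children[i] = new
                return True
        return False

    def __repr__(self):
        if self.kind == 'imm':
            return repr(self.value)
        return '%s(%s)' % (self.value, ', '.join(repr(c) for c in self.children))


def visit_preorder(node, process, parent=None):
    """Call process(node, parent) before descending into the children.
    If process returns a replacement, the walk continues from that new node."""
    replacement = process(node, parent)
    if replacement is not None:
        node = replacement
    for child in list(node.children):
        visit_preorder(child, process, node)


def make_rule(counter):
    """Build a rule that replaces calls to a (toy) getFadingEdgeLength with 12."""
    def process(e, parent):
        if e.kind == 'call' and e.value == 'getFadingEdgeLength' and parent is not None:
            repl = Node('imm', 12)
            if parent.replace_sub_expression(e, repl):
                counter[0] += 1
                return repl
        return None
    return process


# (getFadingEdgeLength() >> 16) + 798, written as a toy tree
tree = Node('op', 'add', [
    Node('op', 'shr', [Node('call', 'getFadingEdgeLength'), Node('imm', 16)]),
    Node('imm', 798),
])
count = [0]
visit_preorder(tree, make_rule(count))
print(tree, count[0])
```

Because the walk is pre-order, the callback has to hand the substituted node back to the walker, otherwise the traversal would continue on a node that is no longer part of the tree.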
Let’s put this new API to practical, real-world use. First, some background: JEB ships with emulator-backed IR optimizers that attempt to auto-decrypt immediates such as strings. While this deobfuscator generally performs well on DexGuard-protected files, lately, we’ve received samples for which strings were not decrypted. The reason is quite straightforward; see this example:
throw new java.lang.IllegalStateException(o.isUserRecoverableError.read(((char)android.text.TextUtils.getOffsetBefore("", 0)), 12 - java.lang.Long.compare(android.os.Process.getElapsedCpuTime(), 0L), (android.view.ViewConfiguration.getFadingEdgeLength() >> 16) + 798).intern());
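In the above code (extracted from a protected method), read is a string decryptor. Alas, the presence of calls such as:
– TextUtils.getOffsetBefore("", 0)
– Long.compare(Process.getElapsedCpuTime(), 0L)
– ViewConfiguration.getFadingEdgeLength() >> 16
prevents the generic decryptor from kicking in. Indeed, what is an emulator supposed to make of those calls to external APIs, whose results are likely to be context-dependent? In practice though, they could be resolved by some ad-hoc optimizations:
– getOffsetBefore("", 0) evaluates to 0
– getElapsedCpuTime() returns a strictly positive value, which makes the result of compare() against 0 predictable
– getFadingEdgeLength() returns a small positive integer, so shifting it right by 16 bits yields 0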
We will craft the following IR optimizer: (file DGReplaceApiCalls.py)
    from com.pnfsoftware.jeb.core.units.code.android.ir import AbstractDOptimizer, IDVisitor

    class DGReplaceApiCalls(AbstractDOptimizer):
        # note that we extend AbstractDOptimizer for convenience, instead of implementing IDOptimizer from scratch
        def perform(self):
            # create our instruction visitor
            vis = AndroidUtilityVisitor(self.ctx)
            # visit all the instructions of the IR CFG
            for insn in self.cfg.instructions():
                insn.visitInstruction(vis)
            # return the count of replacements
            return vis.cnt

    class AndroidUtilityVisitor(IDVisitor):
        def __init__(self, ctx):
            self.ctx = ctx
            self.cnt = 0

        def process(self, e, parent, results):
            repl = None

            if e.isCallInfo():
                sig = e.getMethodSignature()

                # TextUtils.getOffsetBefore("", 0)
                if sig == 'Landroid/text/TextUtils;->getOffsetBefore(Ljava/lang/CharSequence;I)I' and e.getArgument(0).isImm() and e.getArgument(1).isImm():
                    buf = e.getArgument(0).getStringValue(self.ctx.getGlobalContext())
                    val = e.getArgument(1).toLong()
                    if buf == '' and val == 0:
                        repl = self.ctx.getGlobalContext().createInt(0)

                # Long.compare(xxx, 0)
                elif sig == 'Ljava/lang/Long;->compare(JJ)I' and e.getArgument(1).isImm() and e.getArgument(1).asImm().isZeroEquivalent():
                    val0 = None
                    arg0 = e.getArgument(0)
                    if arg0.isCallInfo():
                        sig2 = arg0.getMethodSignature()
                        if sig2 == 'Landroid/os/Process;->getElapsedCpuTime()J':
                            # elapsed time always >0, value does not matter since we are comparing against 0
                            val0 = 1
                    if val0 != None:
                        if val0 > 0:
                            r = 1
                        elif val0 < 0:
                            r = -1
                        else:
                            r = 0
                        repl = self.ctx.getGlobalContext().createInt(r)

                # ViewConfiguration.getFadingEdgeLength()
                elif sig == 'Landroid/view/ViewConfiguration;->getFadingEdgeLength()I':
                    # always a small positive integer, normally set to FADING_EDGE_LENGTH (12)
                    repl = self.ctx.getGlobalContext().createInt(12)

            if repl != None and parent.replaceSubExpression(e, repl):
                # success (this visitor is pre-order, we need to report the replaced node)
                results.setReplacedNode(repl)
                self.cnt += 1
What does this code do?
– First, it enumerates and visits all CFG instructions.
– The visitor checks for IDCallInfo IR expressions matching the kinds of Android framework API calls described above: getOffsetBefore(), compare(getElapsedCpuTime(), 0), getFadingEdgeLength().
– It evaluates the results and replaces the IR call expressions (IDInvokeInfo) with newly-created constants (IDImm).
The resulting IR, which the plugin could print, would look like:
throw new java.lang.IllegalStateException(o.isUserRecoverableError.read(((char)0), 12 - 1, 0 + 798).intern());
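The remaining arithmetic in the expression above is small enough to check by hand. The plain-Python sketch below reproduces it under the same assumptions the plugin makes: getOffsetBefore("", 0) is 0, the elapsed CPU time is positive, and the fading edge length is 12. The helper java_long_compare is a hypothetical stand-in that mirrors the -1/0/1 results assumed by the plugin for Long.compare; none of this calls into JEB or Android.

```python
def java_long_compare(x, y):
    """Stand-in for java.lang.Long.compare, using the -1/0/1 results assumed by the plugin."""
    return (x > y) - (x < y)

# Values assumed by the plugin for the three framework calls
offset_before = 0        # TextUtils.getOffsetBefore("", 0)
elapsed_cpu_time = 1     # any positive value gives the same comparison against 0
fading_edge_length = 12  # small positive integer

arg0 = offset_before & 0xFFFF                        # (char) cast keeps the low 16 bits
arg1 = 12 - java_long_compare(elapsed_cpu_time, 0)   # 12 - 1
arg2 = (fading_edge_length >> 16) + 798              # 0 + 798

print(arg0, arg1, arg2)  # prints: 0 11 798
```

With all three arguments reduced to constants, the call to read() no longer depends on any external API, which is what allows an emulator to evaluate it.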
Subsequently, other optimizers, built into dexdec, can kick in, clean the code further (e.g. fold constants), and make the read() invocation a candidate for string auto-decryption, yielding the following result:
Done!
The DGReplaceApiCalls.py script can be found in your coreplugins/scripts folder. Feel free to extend it further. It appears that recent versions of DexGuard make extensive use of these tricks to thwart auto-deobfuscators.
coreplugins/scripts. As a plugin grows in size and complexity, working with a strongly-typed language like Java, coupled with excellent javadoc integration in an IDE, becomes invaluable.

With this option disabled, when your caret is positioned on a method, issuing a decompilation request will only decompile the target method and nothing else (not even inner classes or methods of the target will be decompiled).

IJavaMethod, not the usual IJavaClass. Fully-qualified names are used to represent types, since import statements are not specified. An added value to the views associated with such units lies in the “IR-CFG” fragment, representing the final (most refined) IR before the AST generation phase kicked in:

DUtil class. Generally, explore the ir/ package’s javadoc; you will find plenty of useful information in there.

That’s it for now. We’ll publish more about this in the summer. Have fun crafting your own IR plugins. As usual, reach us on Twitter’s @jebdec, Slack’s jebdecompiler, or privately over email. Until next time! – Nicolas