About a month or two ago, someone asked me to analyze some obfuscated Android malware. Recently, I finally had a chance to take a look. I ended up using dex-oracle along with some tricks to partially deobfuscate it. In this post, I’m going to explain the tricks and the overall process I used. This post will be useful if you deal with a lot of obfuscated Android apps.
The main problem was that dex-oracle didn’t work “out of the box”. It took some “hacking” to make it work. Specifically, I modified an existing deobfuscation plugin to create two new plugins, and I also slightly modified the app itself. It’s really hard to make completely generalized deobfuscation tools, or any kind of advanced tool, so you’ll need to know how it works in order to modify it to suit your needs.
The Sample
Here’s the SHA256:
|
|
High-Level Analysis
I like to start with a decompilation just to get a high-level overview of the package structure. Here’s the class list:
Some class names have been ProGuard’ed (a, b, c, etc.) but some haven’t (Ceacbcbf). These unobfuscated classes are probably Android components (activity, service, broadcast receiver, etc.) which must be declared in the manifest. Thus, any tool which automatically renames them would also have to rename them in the manifest, which is hard. These may have been manually changed. The obfuscation is probably home-made and partially done by hand. This means it’s probably malicious because a legit developer would probably pull a commercial obfuscator off the shelf and just use that. They wouldn’t waste time changing their class names to something indecipherable like Aeabffdccdac.
The code is obfuscated. Below is a class which shows the obfuscation:
You can’t see any strings or class names, which is really annoying. This looks like something Simplify can handle, but, spoilers, it fails miserably. That’s fine. I have many tricks up my sleeve. Let’s take a look at the Smali and see if anything jumps out.
String and Class Obfuscation
The first type of obfuscation which jumped out at me was an “indexed string lookup” type obfuscation.
|
|
This pattern is found hundreds of times in the code. It takes a number, passes it to f.a(int), and gets a string back. This is some basic “level 1” style encryption. There’s probably a big method somewhere which builds an array of strings that the number indexes into.
A second type of obfuscation hides class constants using an identical technique:
|
|
This code passes a number to g.c(int) and gets back a class object (const-class).
You may be thinking you’ll have to reverse engineer the lookup methods, and you’d be wrong. It’s cool and all to deep dive into the complex code and completely master it by writing a decryption routine. But honestly, forget that. Speed is the name of the game, and I really don’t have time to mess around with this malware author’s home-made, amateur-hour obfuscation. Instead of reversing everything, consider that these “lookup” methods are both static. It should be possible to just execute them with the same inputs from the code to get back the decrypted output. For example, in the case of string decryption, I should be able to execute f.a(0x320fb26f) and get back the decrypted string.
The question is, of course, how do you execute just the target method code? It’s an APK. How can you execute just the method you want with the inputs you want? How do you harness the target methods? There are two paths you can go by:
1. Convert target DEX to a JAR using dex2jar or enjarify. Then, import the JAR into a Java app and call the decryption code from your Java app.
2. Create a stub / driver app which takes command line arguments and can reflect methods in a DEX file. Then, execute the driver app + target DEX on an emulator.
As it happens, I’ve already created dex-oracle which does #2. I like #2 more than #1 because it doesn’t rely on decompilers which often introduce subtle logic bugs. However, I’ve used #1 a few times in a pinch, so it’s worth mentioning. I went about adding support for this type of obfuscation to dex-oracle. The plugins were added in Add indexed string + class lookups.
The way dex-oracle works is pretty simple. It contains a collection of plugins which define regular expressions which pull out key bits of information – method calls and arguments. Then, it constructs real method calls with the arguments you pull out and passes them to a driver which executes the original DEX file on an emulator. Finally, the plugin defines how the driver output should be used to modify the method.
For example, the regular expression could look for “a const number, a call to a static method which takes a number and returns a string, and a move of the result to a register”. Then, the driver executes that method with the number and returns the decrypted string. Finally, the original string lookup code is replaced with just the decrypted string. You can read more about how it works in TetCon 2016 Android Deobfuscation Presentation.
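To make the “regex, then execute, then replace” idea concrete, here’s a rough Python sketch of the matching and rewriting steps. It is not dex-oracle’s actual plugin code (dex-oracle is its own tool with its own plugin format). The Smali snippet, the register names, the pattern, and the helper names find_lookups and replace_lookup are all assumptions made up for illustration, and the decrypted value is passed in by hand instead of coming from a driver running on an emulator.

```python
import re

# Hypothetical Smali snippet modeled on the pattern described in this post.
# Register names and layout are assumptions for illustration only.
SMALI = """\
    const v0, 0x320fb26f

    invoke-static {v0}, Lxjmurla/gqscntaej/bfdiays/f;->a(I)Ljava/lang/String;

    move-result-object v1
"""

# Match: a const load, a static call that takes that int and returns a String,
# and the move of the result into a register.
INDEXED_STRING_LOOKUP = re.compile(
    r"const(?:/\w+)? (?P<arg_reg>[vp]\d+), (?P<value>-?0x[0-9a-fA-F]+)\s+"
    r"invoke-static \{(?P=arg_reg)\}, "
    r"L(?P<cls>[^;]+);->(?P<method>[^(]+)\(I\)Ljava/lang/String;\s+"
    r"move-result-object (?P<dest_reg>[vp]\d+)"
)


def find_lookups(smali):
    """Return (class, method, int argument, destination register) per call site."""
    hits = []
    for m in INDEXED_STRING_LOOKUP.finditer(smali):
        hits.append((
            m.group("cls").replace("/", "."),
            m.group("method"),
            int(m.group("value"), 16),  # the driver gets the argument as a plain int
            m.group("dest_reg"),
        ))
    return hits


def replace_lookup(smali, decrypted):
    """Swap each matched call site for a plain const-string.

    `decrypted` stands in for whatever the driver returned; a real tool
    would also escape quotes and other special characters in it.
    """
    def repl(m):
        return 'const-string %s, "%s"' % (m.group("dest_reg"), decrypted)
    return INDEXED_STRING_LOOKUP.sub(repl, smali)
```

Running find_lookups(SMALI) gives one call site for xjmurla.gqscntaej.bfdiays.f.a with the argument 839889519, which is just 0x320fb26f in decimal.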
dex-oracle Before Modification
Unfortunately, even with the new plugins, dex-oracle fails. To keep things simple, I disable all plugins except IndexStringLookup and I only process the d class from the picture example above.
|
|
The “Invalid date/time in zip entry” stuff is just noise. Maybe they tried obfuscating the timestamp in the ZIP? I dunno.
What concerns me is the “Unsuccessful status: failure for Error executing 'static java.lang.String xjmurla.gqscntaej.bfdiays.f.a(int)' with 'I:839889519'”. The error tells me there’s a NullPointerException when it executes f.a(int). Looks like every time it tried to call that method, it failed. So, let’s look at f.a(int).
|
|
The entire method is pretty small. It just subtracts the first argument from a big constant and uses that as an index into a string array, Lxjmurla/gqscntaej/bfdiays/f;->k:[Ljava/lang/String;. Well, let’s look at where f;->k is initialized.
|
|
There’s only one sput-object and it’s in xjmurla/gqscntaej/bfdiays/Ceacabcbf.smali. By looking for this line in Ceacabcbf, we find private Ceacabcbf;->a()V. This is a big, long, complicated method which contains a HUGE string literal which is processed, chunked up, and stored in f;->k. Hmm, our NullPointerException is caused by this field not getting initialized. This means that Ceacabcbf;->a()V is not getting called during execution of the string decryption method. Well, when is it called?
|
|
Ahh, it’s only called in Ceacabcbf. Let’s find that.
|
|
It’s called in Ceacabcbf;->onCreate()V. This class is a subclass of Application. Without looking at the manifest, I’m pretty sure that when the app starts, this component is created, onCreate()V is called, the decrypted string array is built, and most importantly f;->k is initialized. Hmm, how can I make it so that dex-oracle calls this method when decrypting strings?
My first thought is to add a call to Ceacabcbf;->a()V in f;-><clinit>. This ensures that when the string decryption class f is loaded, it initializes the decrypted string array. BUT, a()V is a direct method (it’s private), so it can’t be invoked from f. WHAT TO DO?
Well, this is kind of dumb but it works sometimes. Just create a new public, static method called Ceacabcbf;->init_decrypt()V and copy the code from Ceacabcbf;->a()V. Then, add a line to call this method in f;-><clinit>:
|
|
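If it helps to see why this patch matters, here’s a toy Python model of the situation. It is only an analogy: the class and function names, the KEY constant, and the two-entry table are all made up, and Python raises a TypeError where the real code throws a NullPointerException. The point is just that the lookup fails when nothing has filled the table, and succeeds once the initializer runs first, which is what the added call in the static initializer arranges.

```python
KEY = 0x320fb270  # hypothetical constant; the real one lives in f.a(int)


class StringTable:
    """Stand-in for class f: a static table plus an indexed lookup."""
    k = None  # stays None until something fills it, like f;->k

    @classmethod
    def a(cls, n):
        # Same shape as the real method: constant minus argument is the index.
        return cls.k[KEY - n]


def init_decrypt():
    """Stand-in for the copied initializer; the real one unpacks a huge literal."""
    StringTable.k = ["placeholder-0", "placeholder-1"]


def call_like_driver(n, run_static_init):
    """Call the lookup in a 'fresh process', optionally running the initializer."""
    StringTable.k = None  # the driver never runs Application.onCreate()
    if run_static_init:
        init_decrypt()  # what the added call in the static initializer does
    try:
        return StringTable.a(n)
    except TypeError:
        # Python's analog of the NullPointerException dex-oracle reported.
        return None
```

With this model, call_like_driver(0x320fb26f, False) fails and returns None, while call_like_driver(0x320fb26f, True) returns the made-up entry at index 1.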
dex-oracle After Modification
After making some changes which hopefully work, I need to rebuild the DEX from the modified Smali and try dex-oracle on it.
|
|
No errors. Let’s see the decompilation.
|
|
Oh, hello there Mr. C&C domain! GET REKT BRO.
Ok, but that still leaves the class deobfuscation. That’s still annoying, right? Well, to keep this post short, dex-oracle fails when deobfuscating classes for the same reason as it originally failed for strings. The same Ceacabcbf;->a()V method needs to be called.
The same trick can be used: just call Ceacabcbf;->init_decrypt()V in g;-><clinit>. However, g doesn’t have a <clinit> so you’ll have to add one:
|
|
Now, rebuild and let dex-oracle do its thing:
|
|
Let’s see if the decompilation looks any different.
|
|
There’s not much difference for this method, but other methods have a lot more information, especially in the Smali where you can see lots of const-class instructions. There’s still one call to g.c(int) which isn’t deobfuscated. I found out that this is because the method call succeeds but returns null. Maybe that’s why it’s in a try-catch? Maybe it’s trying to load a class which doesn’t exist on every Android API version?
One final test: run it against the entire DEX file.
|
|
It worked. Cool. Now there are lots of strings! This should also make it a lot easier for Simplify to work because there’s less code to execute and fewer places to fail.
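One cheap sanity check after a full run is to count how many lookup call sites are still left in the disassembled Smali. The sketch below is my own helper, not part of dex-oracle. It assumes the output DEX has already been disassembled into a directory of .smali files, and the descriptor for g.c(int) assumes g lives in the same package as f, which I’m guessing at here.

```python
import os

# Method references to look for. The first comes from the error message above;
# the second assumes g sits in the same package as f (unverified).
TARGETS = (
    "Lxjmurla/gqscntaej/bfdiays/f;->a(I)Ljava/lang/String;",
    "Lxjmurla/gqscntaej/bfdiays/g;->c(I)Ljava/lang/Class;",
)


def count_remaining(smali_root, targets=TARGETS):
    """Map each .smali file to the number of lookup calls still present in it."""
    counts = {}
    for dirpath, _dirnames, filenames in os.walk(smali_root):
        for name in filenames:
            if not name.endswith(".smali"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8") as fh:
                text = fh.read()
            remaining = sum(text.count(target) for target in targets)
            if remaining:
                counts[path] = remaining
    return counts
```

Any file that still shows up in the result is worth a look by hand, like the one g.c(int) call that returned null.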
Summary
Hopefully after reading this you have a better idea of how to bend dex-oracle to suit your needs. It’s pretty flexible and great when you can isolate the code you need to run to a single method. Sometimes you need to make changes to an Android app to help dex-oracle, but Smali is relatively easy to modify and a lot of malware doesn’t bother doing anti-tampering checks.