Obfuscating code means making it difficult for a reader to understand. It is a common practice in programming, used to protect intellectual property such as source code. The main goal of code obfuscation is to make reverse engineering as difficult as possible for an attacker. Any insight an unauthorized party gains into the application logic poses a security threat.
Security threats include a variety of harmful actions, ranging from code tampering and vulnerability exploitation to data extraction, all of which are common elements of a reverse engineering attack. Reverse engineering enables hackers to mimic the look and feel of an application, repackage it, and place it on third-party app stores, where each download endangers unsuspecting end users.
Code obfuscation is the process of altering the original code so that it is hard for an attacker to interpret, while the code remains fully functional. For a layered approach and enhanced security, use several different code obfuscation techniques on top of each other.
Unfortunately, hackers also use code obfuscation to bypass antivirus protection and gain unauthorized access. Just as an honest mobile application developer would use code obfuscation to heighten the security levels of their app, hackers use it to protect their malware and evade detection.
Partial or complete encryption of code is considered an obfuscation method. Removing metadata that provides additional context and explains the application logic is another obfuscation approach. Changing class and variable names into random labels and stuffing the application with meaningless code are further ways of obfuscating code. Applying any technique that makes the code difficult to read is considered code obfuscation.
By applying multiple code obfuscation techniques, you heighten the security of the application and make reverse engineering attacks considerably harder.
Redundant logic and meaningless pieces of code that add no functionality to the application distract the reader and make it difficult to determine which parts of the code matter for a reverse engineering attack.
To get a better sense of what code obfuscation techniques look like in practice, below is an example that shows how a few simple lines of code become tricky to understand once obfuscation is applied. The obfuscated code is extremely hard, if not impossible, to follow with the human eye.
Rename obfuscation alters the names of methods and variables within the code. The replacement names usually consist of symbols and numbers and are deliberately confusing in order to distract the reader; in some cases, the characters are invisible or unprintable. Although altered, the code performs the same functions as it would without obfuscation. The rename obfuscation technique is mostly present in .NET, Java, iOS, and Android obfuscators.
The packing method compresses the entire application to make the obfuscated code unreadable to the uninvited guest.
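A minimal sketch of the idea, assuming a Python program packed with the standard `zlib` and `base64` modules. The `pack` and `run_packed` helpers and the `greet` sample are hypothetical; commercial packers work on native or bytecode binaries and are far more involved.

```python
import base64
import zlib


def pack(source: str) -> str:
    """Compress the source and encode it so it ships as an opaque text blob."""
    return base64.b64encode(zlib.compress(source.encode("utf-8"))).decode("ascii")


def run_packed(blob: str) -> dict:
    """Loader stub: unpack the blob in memory and execute it."""
    source = zlib.decompress(base64.b64decode(blob)).decode("utf-8")
    namespace = {}
    exec(source, namespace)
    return namespace


ORIGINAL = "def greet(name):\n    return 'Hello, ' + name\n"
PACKED = pack(ORIGINAL)
```

Only `PACKED` and the loader would be distributed. Note that the loader itself reveals how to unpack the code, which is why packing is normally layered with other techniques.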
The instruction pattern/flow transformation technique alters common instructions made by the compiler into less common, more complex instructions that perform the same functions.
Dummy code's role is to hinder reverse engineering attacks by filling the program with unnecessary code that does not affect the execution of the application. This approach is effective at distracting both the human eye and the decompilers used to gain insight into the application logic.
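The sketch below shows what dummy code insertion might look like on a small checksum routine. The `decoy` variable and the never-taken branch are invented for illustration and have no effect on the result.

```python
def checksum(data: bytes) -> int:
    total = 0
    for b in data:
        total = (total + b) % 65521
    return total


def checksum_with_dummy(data: bytes) -> int:
    total = 0
    decoy = 0x5A5A                           # dummy state, never used in the result
    for i, b in enumerate(data):
        decoy = (decoy * 31 + i) & 0xFFFF    # dummy work that looks meaningful
        total = (total + b) % 65521
        if decoy == -1:                      # never true: decoy is masked to 0..65535
            total ^= decoy
    return total
```

The extra arithmetic also shows the cost of this technique: every loop iteration now does more work than the original.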
Metadata removal aims to strip any additional context that provides more information about the application logic. Elements such as debug information and metadata can be of extreme value to an attacker. Removing them is a common practice in heightening application security.
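For a Python code base, one way to sketch this is to strip docstrings and comments with the standard `ast` module (`ast.unparse` requires Python 3.9 or later). The `strip_docstrings` helper is a hypothetical name; compiled platforms would instead strip debug symbols and similar metadata with their own tooling.

```python
import ast


def strip_docstrings(source: str) -> str:
    """Return equivalent source without docstrings; comments vanish during parsing."""
    tree = ast.parse(source)
    owners = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)
    for node in ast.walk(tree):
        if not isinstance(node, owners) or not node.body:
            continue
        first = node.body[0]
        if (isinstance(first, ast.Expr)
                and isinstance(first.value, ast.Constant)
                and isinstance(first.value.value, str)):
            # Keep the body non-empty so the result is still valid Python.
            node.body = node.body[1:] or [ast.Pass()]
    return ast.unparse(tree)


SAMPLE = (
    "def f(x):\n"
    '    """Secret pricing rule."""\n'
    "    # internal note\n"
    "    return x * 2\n"
)
```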
Opaque predicate insertion involves adding conditional branches whose outcome is fixed and known to the obfuscator but hard for a reader to deduce. The branch that is never taken can contain plausible-looking or even incorrect code, because it will never execute. The role of an opaque predicate is to puzzle the reader with additional statements, usually if-then-else conditional branches, that lure the attacker in the wrong direction when trying to figure out the application logic.
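A minimal sketch, assuming a predicate based on the fact that the product of two consecutive integers is always even. The `salt` parameter and the misleading fallback branch are invented for this example.

```python
def is_adult(age: int) -> bool:
    return age >= 18


def is_adult_obfuscated(age: int, salt: int = 7) -> bool:
    # Opaque predicate: salt * (salt + 1) is even for every integer salt,
    # so this condition is always true, although that is not obvious at a glance.
    if (salt * (salt + 1)) % 2 == 0:
        return age >= 18
    # Dead branch: plausible-looking but wrong logic that never runs.
    return age < 0
```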
Debuggers are tools used for analyzing code line by line, helping developers find issues within the code. Hackers, on the other hand, use debuggers for reverse engineering. Anti-debug tools are helpful in detecting the use of debuggers and preventing potential attacks. Essentially, what happens when an app runs under a debugger is the following:
An anti-tamper tool is a form of application self-protection mechanism injected into the app's code. If tampering is detected, the application should react with one of the following actions:
The string encryption technique hides the strings in a managed executable. Without obfuscation, the strings are readable. With string encryption, however, the original value of each string is revealed only when necessary, at runtime.
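The sketch below illustrates the runtime-decryption idea with a simple repeating-key XOR. The key, the URL, and the helper names are hypothetical, and XOR is used only to keep the example short; it is not a strong cipher. In a real build, the encryption step would run ahead of time so that only the encrypted bytes ship.

```python
from itertools import cycle

KEY = b"k3y"  # hypothetical key; real tools derive or hide it


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte with a repeating key; applying it twice restores the input."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# Stands in for the build step that would replace the literal with ciphertext.
ENCRYPTED_URL = xor_bytes(b"https://api.example.com/v1", KEY)


def get_url() -> str:
    """Decrypt the string only at the moment it is needed."""
    return xor_bytes(ENCRYPTED_URL, KEY).decode("utf-8")
```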
Control flow obfuscation is one of the code obfuscation techniques that pays off the most. To protect your code, you introduce arbitrary statements and dead code that will never execute. The decompiled code takes on the look of spaghetti logic, making it extremely difficult for a malicious party to understand. However, it is important to note that control flow obfuscation often affects the runtime performance of the methods it is applied to.
This code obfuscation technique randomly shuffles routines and branches within the code without affecting code execution. It is popular among malware writers as a way to avoid antivirus detection.
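One common form of control flow obfuscation is flattening, where structured loops and branches are replaced by a dispatcher driven by a state variable. The sketch below applies it by hand to a greatest-common-divisor routine; the state numbers are arbitrary and chosen only for illustration.

```python
def gcd(a: int, b: int) -> int:
    while b:
        a, b = b, a % b
    return a


def gcd_flattened(a: int, b: int) -> int:
    state = 3
    while True:                      # dispatcher loop
        if state == 3:               # was: the `while b` test
            state = 7 if b else 1
        elif state == 7:             # was: the loop body
            a, b = b, a % b
            state = 3
        elif state == 1:             # was: the return statement
            return a
```

The flattened version does extra comparisons on every step, which is the kind of runtime overhead mentioned above.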
Code virtualization, or virtualization obfuscation, is a code obfuscation method protecting software from malicious code analysis. It replaces the code in a binary with a bytecode that is semantically equivalent. The bytecode can be interpreted exclusively by a virtual machine. This makes the revealing of the final code a tiring job for the attacker.
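To make the idea concrete, here is a toy stack-based virtual machine. The opcode values, the `run` interpreter, and the sample program (which computes `x * 3 + 4`) are all invented for this sketch; real virtualization obfuscators generate custom instruction sets that are much harder to analyze.

```python
# Hypothetical opcodes for a toy stack machine.
LOAD_ARG, PUSH, ADD, MUL, RET = 0x10, 0x21, 0x32, 0x43, 0x54


def run(bytecode, args):
    """Interpret the custom bytecode; only this VM knows what the numbers mean."""
    stack, pc = [], 0
    while True:
        op = bytecode[pc]
        if op == LOAD_ARG:
            stack.append(args[bytecode[pc + 1]])
            pc += 2
        elif op == PUSH:
            stack.append(bytecode[pc + 1])
            pc += 2
        elif op in (ADD, MUL):
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if op == ADD else left * right)
            pc += 1
        elif op == RET:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op:#x}")


# Bytecode equivalent of: def f(x): return x * 3 + 4
PROGRAM = [LOAD_ARG, 0, PUSH, 3, MUL, PUSH, 4, ADD, RET]
```

An attacker who sees only `PROGRAM` has to reverse engineer the interpreter first before the protected logic becomes readable.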
Arithmetic obfuscation takes simple arithmetic and logical pieces of code and replaces them with more complex equivalents.
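Two well-known integer identities can serve as an illustration: `x + y` equals `(x ^ y) + 2 * (x & y)`, and `x ^ y` equals `(x | y) - (x & y)`. The function names below are hypothetical, and the sketch assumes integer operands.

```python
def add_plain(x: int, y: int) -> int:
    return x + y


def add_obfuscated(x: int, y: int) -> int:
    # XOR adds the bits without carries; AND finds the carries, doubled to shift them.
    return (x ^ y) + 2 * (x & y)


def xor_obfuscated(x: int, y: int) -> int:
    # OR counts shared bits once; subtracting AND removes them.
    return (x | y) - (x & y)
```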
Custom encoding encodes strings with a custom algorithm and relies on a decoder function to retrieve the original values at runtime.
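A minimal sketch of such an encoder and decoder pair, assuming Base64 with a remapped alphabet. The reversed alphabet and the function names are choices made for this example only; the point is that a standard Base64 decoder no longer recovers the text.

```python
import base64

STANDARD = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
CUSTOM = STANDARD[::-1]  # hypothetical custom alphabet: the standard one reversed

TO_CUSTOM = bytes.maketrans(STANDARD, CUSTOM)
TO_STANDARD = bytes.maketrans(CUSTOM, STANDARD)


def encode(text: str) -> str:
    """Base64-encode, then remap every symbol to the custom alphabet."""
    return base64.b64encode(text.encode("utf-8")).translate(TO_CUSTOM).decode("ascii")


def decode(blob: str) -> str:
    """Decoder function: undo the remapping, then Base64-decode."""
    return base64.b64decode(blob.encode("ascii").translate(TO_STANDARD)).decode("utf-8")
```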
To determine the quality and success of code obfuscation, take into consideration the following criteria:
This factor reveals how much the obfuscated version of the code differs from the original. The differentiation index is usually determined by counting the predicates inserted into the obfuscated code or by examining the depth of the inheritance tree (DIT). The higher the DIT, the better.
The best way to determine the strength of implemented obfuscation efforts is to apply automated deobfuscation techniques. The more resources, time, and effort it takes to reverse the code back into its original state, the stronger the obfuscation.
To determine the cost of your obfuscation efforts, compare the time and resources needed to execute the obfuscated code with those needed to execute the original version. The best obfuscation results are usually not the most expensive ones but the ones that require a reasonable amount of resources for an adequate amount of protection.
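One simple way to estimate the runtime part of that cost is to time both versions and compare them. The sketch below uses the standard `timeit` module; `overhead_ratio` and the two sample functions are hypothetical, and the measured ratio will vary between machines and runs.

```python
import timeit


def plain_sum(n: int) -> int:
    return sum(range(n))


def padded_sum(n: int) -> int:
    total, decoy = 0, 1
    for i in range(n):
        decoy = (decoy * 31 + i) & 0xFFFF   # dummy work added by obfuscation
        total += i
    return total


def overhead_ratio(original, obfuscated, *args, repeat=3, number=200) -> float:
    """Return obfuscated time divided by original time (1.0 means no overhead)."""
    t_orig = min(timeit.repeat(lambda: original(*args), repeat=repeat, number=number))
    t_obf = min(timeit.repeat(lambda: obfuscated(*args), repeat=repeat, number=number))
    return t_obf / t_orig
```

Calling `overhead_ratio(plain_sum, padded_sum, 1000)` gives a rough multiplier that can be weighed against the protection gained.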
Reflecting back on the variety of code obfuscation methods mentioned in the previous sections, it makes a lot of sense to combine a couple of them. This layered approach heightens the complexity and quality of implemented obfuscation efforts.
The best-obfuscated code is the one that looks as if there is no obfuscation at all. In other words, the obfuscated version should be hard to tell apart from ordinary, non-obfuscated code. The attacker still faces methods and logic that are hard to follow, which hinders their reverse engineering progress.
Although the content of the code is different, all of the functions included in the mobile application's original source code remain the same. The look and feel of the application are intact, backed up by heightened security measures.
The main goal of code obfuscation is to prevent attacks such as reverse engineering. Applying obfuscation makes the code unreadable, discourages hackers from pursuing their malicious attempts, and protects the mobile application from the inside by alerting the app's stakeholders to a potential security threat.
One of the most common reasons behind reverse engineering is copying the stolen code and packing it into a mimic app in order to publish it on third-party app stores. The unsuspecting end users download those imitation apps and place their personal and financial information in the wrong hands. Code obfuscation is a successful tool for preventing such attempts.
Some code obfuscation techniques are based on removing metadata that is not useful or code that doesn't perform any function. This can result in a lighter application and quicker code execution.
Organizations and application stakeholders who want to safeguard their code from attackers and competitors should consider code obfuscation as a means of intellectual property protection. The attackers are no strangers to blackmailing industry giants in order to keep the information regarding security loopholes a secret.
Depending on the techniques used, the change in code performance can vary from 0 to 80%. With rename obfuscation, the change in final performance is hardly noticeable, because only the names of methods, variables, and classes are altered. Control flow obfuscation, however, can have a significant impact on code performance: the added meaningless code performs no useful function, yet it adds weight and slows down execution.
As discussed in the section concerning the quality of obfuscation, it is important to weigh the effort against the outcome of code obfuscation. Not every technique suits every code base. The goal is to find the best tradeoff between the added security layers and the potential impact on code performance.
In case you're curious, feel free to contact us - zero obligation. Our ASEE team will be happy to hear you out.