How to Use Code Obfuscation to Hide Secrets in Your Mobile App

Mobile app security is a crucial concern that developers and businesses alike need to prioritise. With the increasing number of cyber-attacks targeting mobile apps and their APIs, it's more important than ever to protect the secrets that mobile apps use to access those APIs, in order to safeguard users' sensitive data and prevent unauthorised access and the data breaches that follow.

The purpose of this blog post is to show how code obfuscation can be used to hide secrets in mobile apps, and then to discuss its effectiveness and limitations. While this technique is commonly used by developers to protect their mobile app secrets, it is not foolproof and can be bypassed by determined attackers, effectively becoming a Maginot Line in their security defences.

In this blog post we will delve into code obfuscation, explaining what it is, how it works, its advantages and disadvantages, and why it's not enough on its own to hide the API secrets in a mobile app. We will also provide an example of a ChatGPT mobile app that uses code obfuscation to hide the API key, and explain how attackers may get around the Maginot Line that code obfuscation represents. Finally, we will discuss better alternatives for protecting mobile app secrets, where mobile app developers will learn about the importance of adopting a comprehensive end-to-end mobile app security strategy that includes multiple layers of protection for the mobile app and the APIs it communicates with.

What is Code Obfuscation?

Code obfuscation is a technique used by developers to make it more difficult for hackers to understand and reverse-engineer the code of a mobile app. The process involves transforming the app's source code into a form that is more difficult to read and understand, while still preserving its functionality.

There are several techniques used in code obfuscation, including renaming variables, functions, and classes to meaningless names, inserting redundant code and dead code, and using code optimizations that change the structure of the code.

One of the advantages of code obfuscation is that it can help to protect against reverse engineering and intellectual property theft, as it makes it difficult for hackers to understand the app's logic and extract sensitive information. However, there are also some disadvantages. For example, code obfuscation can increase the size of the app and impact its performance, and it may also make debugging and maintenance more difficult for developers.

Despite its advantages, code obfuscation alone does not protect the mobile app's secrets: they are kept in clear text in the mobile app APK, and only the names of the variables holding them are obfuscated. To put it simply, code obfuscation only obfuscates the code, not the data within it. Attackers can still use a variety of techniques to bypass code obfuscation, such as dynamic analysis, code injection, and memory dumping. Therefore, it is important to implement additional layers of protection to ensure the app's security.
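To make the "code, not data" point concrete, here is a minimal Kotlin sketch. The second class is a hand-written approximation of what a renaming pass might produce, not real tool output, and the class names and the EXAMPLE_API_KEY value are hypothetical placeholders.

```kotlin
// Before obfuscation: descriptive names reveal what the code does.
class ApiClient {
    private val apiKey = "Bearer EXAMPLE_API_KEY"
    fun authorizationHeader(): String = apiKey
}

// Hand-written approximation of the same class after a renaming pass
// (an assumption for illustration, not actual obfuscator output):
// the identifiers are now meaningless, but the string literal is unchanged.
class a {
    private val b = "Bearer EXAMPLE_API_KEY"
    fun c(): String = b
}
```

Calling ApiClient().authorizationHeader() and a().c() returns the same string, which is exactly what an attacker searching the binary for string literals would find.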

How to Use Code Obfuscation to Hide Secrets

After introducing the concept of code obfuscation, we will now demonstrate its practical application through an example that involves hiding the ChatGPT API key for the OpenAI API within a mobile app. The app directly accesses this third-party service, but a more secure approach would be to delegate this access to a backend that the developer manages, thus preventing exposure of the ChatGPT API key within the mobile app.

Mobile app developers have become accustomed to accessing third-party APIs directly from their mobile apps to avoid dealing with a backend. However, this decision brings the challenge of protecting the API keys for such third-party services from being extracted by attackers. The most common approach to address this issue is to use code obfuscation to hide these secrets and other sensitive information from prying eyes.

Will code obfuscation work as mobile developers expect to protect such sensitive secrets? That's what you are about to discover with the ChatGPT mobile app example. 

ChatGPT Mobile App Example

We will use the ChatGPT mobile app example, which you need to clone to your computer to follow along:

git clone https://github.com/approov/demo-android-chatgpt

The ChatGPT mobile app has a hard-coded API key so that it can contact the OpenAI API:

companion object {
    private const val apiKey = "Bearer YOUR_CHATGPT_API_KEY"
    private const val apiUrl = "https://api.openai.com/v1/chat/completions"
    private const val apiUrlTest = "https://postman-echo.com/post"
}

In this case YOUR_CHATGPT_API_KEY is hard-coded directly into the mobile app code, which is a bad security practice. A better practice would be to load it from the local.properties file, but that makes no difference for the purposes of hiding the API key with code obfuscation in the release binary, because the value still ends up as a string in the compiled app.

Enabling Code Obfuscation with ProGuard

To apply code obfuscation to the ChatGPT mobile app we will use ProGuard, which is natively integrated in Android Studio (recent versions of the Android Gradle plugin run the R8 compiler by default, which reads the same ProGuard rules files). To enable it, set the property minifyEnabled to true in the build.gradle file, and declare the proguardFiles to load in case you want to configure ProGuard:

buildTypes {
    release {
        minifyEnabled true
        proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro'
    }
}

More advanced open-source and commercial tools exist that do a lot more than code obfuscation, for example string obfuscation, string encryption and code hardening. We will look at those in other blog posts.

Build a Release to Find Out if Code Obfuscation Hides the API Key

To check whether code obfuscation also obfuscates the value “YOUR_CHATGPT_API_KEY” of the apiKey variable when a release is built with ProGuard enabled, we will build a simple unsigned APK from the command line.

From your terminal execute:

bash ./gradlew build

When you build a production release the mobile app code will be obfuscated, but not the strings in it, so YOUR_CHATGPT_API_KEY will be in clear text in your mobile app binary.

To check that YOUR_CHATGPT_API_KEY is in clear text in the mobile app binary we first need to decode the APK. For that we will use apktool, an open-source reverse engineering tool for Android apps:

apktool decode ./app/build/outputs/apk/release/app-release.apk --force --output ./decoded-apk

Once the APK is decoded, checking is as easy as running the grep command in a terminal:

grep -irn 'YOUR_CHATGPT_API_KEY' ./decoded-apk

And the output should be similar to this:

./decoded-apk/smali/io/approov/chatgpt/MainActivity.smali:618:    const-string v2, "Bearer YOUR_CHATGPT_API_KEY"

As you can see, YOUR_CHATGPT_API_KEY is not obfuscated and is still kept in clear text in the APK that you will release, and so is up for grabs for any attacker who takes the time to reverse engineer the APK. The only thing that changed is that the variable name apiKey no longer appears next to the value (v2 in the output above is just the register that the string is loaded into). That isn’t enough to protect the API key from being stolen; it just makes it slightly more time-consuming to find and extract via static binary analysis.

Obviously, in a real-world scenario the attacker doesn’t have access to the mobile app source code, and so doesn’t know beforehand the value YOUR_CHATGPT_API_KEY hard-coded into the variable apiKey. The attacker therefore has a little more work to do to find and extract the API key from the mobile app APK.

An attacker can take several approaches, for example:

  • A case-insensitive search in the decoded APK for an occurrence of the word Bearer, which is often used to prefix Authorization tokens.
  • If that fails to find an occurrence of Bearer, the attacker will look for common header names used to pass tokens to APIs, e.g. Authorization, Api-Key, ApiToken, Access-Token, etc.
  • Another approach is to search for the prefix that some API providers add to their tokens. Yes, you read that correctly, and these providers are big names in the payments, cloud services and AI spaces. You would not believe how easy they make attackers' lives with this dumb use of a prefix. Take the prefix sk-, for example: do you know who it belongs to? I will let you figure out who uses this prefix for their API keys.
  • When all the previous attempts fail, attackers can resort to searching for high-entropy strings with open-source or commercial tools that will give a list of probable candidates.
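The search strategies in this list can be sketched in a few lines of Kotlin. This is a rough illustration rather than a real scanning tool: the function names, the hint list, the minimum length of 20 characters and the entropy threshold of 4.0 bits per character are all assumptions chosen for the example.

```kotlin
import kotlin.math.log2

// Shannon entropy in bits per character; random-looking tokens score
// higher than ordinary words.
fun shannonEntropy(s: String): Double {
    if (s.isEmpty()) return 0.0
    return s.groupingBy { it }.eachCount().values.sumOf { count ->
        val p = count.toDouble() / s.length
        -p * log2(p)
    }
}

// Illustrative hints taken from the list above (assumed, not exhaustive).
val secretHints = listOf("bearer", "authorization", "api-key", "apitoken", "access-token")

// Matches the content of double-quoted string literals, as seen in smali.
val stringLiteral = Regex("\"([^\"]*)\"")

// Returns the string literals that look like secrets or point to one.
fun findSecretCandidates(lines: List<String>, minEntropy: Double = 4.0): List<String> =
    lines.flatMap { line ->
        stringLiteral.findAll(line).map { it.groupValues[1] }.toList()
    }.filter { literal ->
        val lower = literal.lowercase()
        secretHints.any { it in lower } ||          // Bearer and header names
            lower.startsWith("sk-") ||              // provider-specific prefix
            (literal.length >= 20 && shannonEntropy(literal) >= minEntropy)
    }
```

Feeding it the smali line shown earlier, const-string v2, "Bearer YOUR_CHATGPT_API_KEY", returns that literal as a candidate because it contains the word Bearer, while a short ordinary literal such as "Hello" is ignored.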

Not all attackers like to spend too much time on static binary analysis, and some may not start with it. Instead, some attackers prefer to start with dynamic analysis by performing runtime attacks, where they can use MitM attack tools and sometimes instrumentation frameworks. The next section will go into that in more detail.

Why Code Obfuscation isn’t Enough

Code obfuscation can be a useful technique to protect mobile app source code and the related intellectual property, but it is ineffective at safeguarding sensitive data from attackers. If it is used only with that intent, the developer just achieves a false sense of security, a Maginot Line. Here are some reasons why:

Weaknesses in obfuscation algorithms

The effectiveness of obfuscation techniques depends on the algorithms used to implement them. Some algorithms may be weaker than others, and attackers can exploit weaknesses in the algorithm to reverse the code obfuscation.

Reverse engineering 

Screenshot from MobSF dashboard with the results

While code obfuscation can make it difficult for attackers to read the code and find which variables hold the secrets, it doesn’t prevent them from extracting secrets by using reverse engineering techniques. Attackers can use sophisticated tools to reverse engineer the obfuscated code and reveal both secrets and code. A tool that I personally like to use is the Mobile Security Framework (MobSF), as seen in the article I wrote: How to Extract an API Key from a Mobile App by Static Binary Analysis.

MitM Attack at Runtime

MitM attack diagram

No one likes to waste time when faster and easier options exist, so an attacker's preferred approach may be to just install the mobile app on a device they control and extract the secret with a MitM attack. You can see for yourself how easy it is to do a MitM attack on this same ChatGPT mobile app: How to use a MitM attack to bypass code obfuscation to extract secrets from the ChatGPT mobile app

Dynamic analysis at runtime with instrumentation frameworks

Screenshot from the Frida code snippets web page.

Code obfuscation is applied at compile time, but at runtime the code and data are in clear text and can be analysed. Attackers can use tools to monitor, modify or completely override the app's behaviour at runtime, with the intent of bypassing protections and then extracting sensitive information like the ChatGPT API key. In the article I wrote on How to Bypass Certificate Pinning with Frida on an Android App you can see how to do it yourself.

What are the Code Obfuscation Advantages?

Obfuscation protects intellectual property against theft

Code obfuscation has been considered an effective way to make it more challenging for attackers to reverse engineer code and steal intellectual property. However, it is important to recognize that with the latest advancements in AI for software development, such as ChatGPT models, this advantage may not be sustainable for much longer.

Obfuscation Increases Difficulty of App Modification for Repackaging

Mobile apps can be vulnerable to being modified and repackaged for distribution by attackers in the official or alternative Android stores. Attackers may have different intentions, ranging from injecting malicious code to adding ads where the revenue goes to the attacker, among other motivations. If the attacker is low-skilled, then code obfuscation can be an effective deterrent and may prevent the attacker from carrying out their intentions.

What are the Code Obfuscation Disadvantages?

Obfuscation doesn’t hide sensitive data

Contrary to what many believe, code obfuscation will not protect sensitive data, like the ChatGPT API key, from being extracted from a mobile application binary, because code obfuscation only obfuscates the name of the variable holding the sensitive data and/or the function used to fetch it. Strings are left in clear text in the mobile app binary after code obfuscation is applied.

Obfuscation can Affect the Mobile App Performance

Code obfuscation can increase the app's size and reduce its performance, which translates into a less optimal experience for the mobile app user. One of the main user retention factors is the user's first impression of a mobile app, and if the app feels slow the chances that the user bounces and uninstalls it are high. This means that when using code obfuscation, you should always measure the performance impact on the user experience with each release. If the impact goes beyond acceptable boundaries, obfuscation may no longer be an option.

Obfuscation can hinder development and maintenance

Obfuscation can make it difficult for developers to understand and debug a released APK when a bug doesn’t manifest itself before the code obfuscation pass, which means that something went wrong with the code obfuscation process or with how it is configured.

What are the better alternatives to code obfuscation for secrets protection?

If the primary objective of code obfuscation is to hide secrets within the mobile app release binary, developers must consider an alternative solution, unless they are willing to accept the risk of their secrets being shipped in plain text format within the mobile app APK. Code obfuscation still leaves the secrets vulnerable to discovery and extraction via static binary analysis. Additionally, the secrets can be stolen at runtime through a MitM attack or by use of an instrumentation framework.

Runtime Secrets

Approov Runtime Secrets diagram

To mitigate the risks associated with using code obfuscation to hide secrets, mobile app developers and businesses should consider alternative methods for securely delivering their secrets. One such method is to have the secrets delivered just-in-time from a backend when they are needed for an API request. For instance, instead of embedding the ChatGPT API key in the mobile app code, developers could use the Approov Just-in-Time Runtime Secrets feature, which securely delivers the secret only to mobile apps that pass a remote mobile app attestation process (1). This attestation verifies the integrity of the mobile app and device before the secret is delivered (2). It provides a strong layer of security against runtime attacks and removes the risk of the secret being stolen through static binary analysis, because the secret is no longer shipped in the binary. It also gives you the peace of mind that the API backend is only processing requests (3) from genuine instances of your mobile app.
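The general shape of just-in-time secret delivery can be sketched as follows. This is not the Approov SDK API: the RuntimeSecretProvider interface, the ChatRequestBuilder class, the secret name and the fake provider are all hypothetical names introduced for illustration, and the attestation step is reduced to a boolean.

```kotlin
// Hypothetical abstraction for illustration; this is NOT the Approov SDK API.
interface RuntimeSecretProvider {
    // Returns the secret, or null when the app instance fails attestation.
    fun fetchSecret(name: String): String?
}

class ChatRequestBuilder(private val secrets: RuntimeSecretProvider) {
    // Builds the headers at request time, so that no API key literal
    // needs to be shipped inside the app binary.
    fun headers(): Map<String, String>? {
        val key = secrets.fetchSecret("chatgpt_api_key") ?: return null
        return mapOf(
            "Authorization" to "Bearer $key",
            "Content-Type" to "application/json"
        )
    }
}

// Stand-in for a service that only releases the secret to app instances
// that pass attestation (an assumption made for this sketch).
class FakeAttestingProvider(private val attestationPasses: Boolean) : RuntimeSecretProvider {
    override fun fetchSecret(name: String): String? =
        if (attestationPasses) "secret-delivered-at-runtime" else null
}
```

With FakeAttestingProvider(true) the builder returns an Authorization header carrying the delivered secret, and with FakeAttestingProvider(false) it returns null, so the request is never sent with a key. The design point is that the secret only ever exists in memory for attested app instances and never as a literal that static analysis of the APK could find.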

 

Conclusion

We learned that when code obfuscation is used to hide secrets, like the ChatGPT API key, this just gives mobile app developers a false sense of security, a Maginot Line, which may come as a surprise for many, since code obfuscation has been traditionally used for this purpose.

On the other hand, code obfuscation is a useful technique to make it more difficult for hackers to understand and reverse-engineer the code of a mobile app to steal any intellectual property. Additionally, it serves as a deterrent for unskilled attackers who may seek to modify the mobile app by inserting malicious code, embedding ads, or other actions and then redistributing it through the official and alternative Android stores. Despite its advantages, code obfuscation can also have some disadvantages, such as increasing the size of the app and impacting its performance, and it may also make debugging and maintenance of release binaries more difficult for mobile app developers. 

In short, code obfuscation should not be relied on for protecting secrets in mobile apps, because the secrets will not be obfuscated and will instead be kept in clear text. Always remember that code obfuscation only obfuscates the code, not the data within it. Mobile app developers should therefore implement additional layers of protection by adopting a comprehensive end-to-end mobile app security strategy that includes multiple layers of protection for the mobile app and the APIs it communicates with, while keeping secrets out of the mobile app code with the use of Runtime Secrets.

If you haven't done so yet, now is the time to read the next article in this series about code obfuscation: How to use a MitM attack to bypass code obfuscation to extract secrets from the ChatGPT mobile app.

Paulo Renato

Paulo Renato is known more often than not as paranoid about security. He strongly believes that all software should be secure by default. He thinks security should always be opt-out instead of opt-in and be treated as a first-class citizen in the software development cycle, instead of an afterthought when the product is about to be finished or released.