Mobile apps commonly use APIs to interact with backend services and information. In 2016, time spent in mobile apps grew an impressive 69% year over year, reinforcing most companies' mobile-first strategies while also providing fresh and attractive targets for cybercriminals. As an API provider, protecting your business assets against information scraping, malicious activity, and denial-of-service attacks is critical to maintaining a reputable brand and maximizing profits.
Properly used, API keys and tokens play an important role in mobile security, efficiency, and usage tracking. Though simple in concept, API keys and tokens have a fair number of gotchas to watch out for to avoid API underprotection.
Previously, in Part 1, we began with a very simple example of API key usage and iteratively enhanced its API protection. In Part 2, we move from keys to JWT tokens within several OAuth2 scenarios. In our final implementation, we remove any user credentials and static secrets stored within the client, and even if a token is somehow compromised, we can limit exposure to a single API call.
For all scenarios to follow, we assume that TLS techniques discussed in Part 1 are used to keep the communications channel secure.
We will use OAuth2 [https://oauth.net/2/] terminology as much as possible. For this article, a client is a mobile application. A resource owner is the application user, and a resource server is a backend server interacting with the client through API calls. An authorization server, if present, will authenticate a resource owner's credentials and authorize limited access to a resource server. A user agent, acting on behalf of an authorization server, will gather resource owner credentials separately from a client.
At the end of Part 1, we used basic access authentication to verify user credentials and start a user session on a server. If authentication succeeds, the server returns a session key to the client. The client adds the session key to API calls, and the server checks that the session key is currently valid and uses it as a key to look up any session state it is storing for the client.
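As a reminder of what that server-side session state looks like, here is a minimal in-memory sketch. The names `start_session` and `lookup_session` and the 15-minute lifetime are assumptions made for illustration, and credential checking is assumed to have happened before `start_session` is called.

```python
import secrets
import time

SESSION_TTL_SECONDS = 15 * 60  # assumed lifetime, not taken from Part 1
_sessions = {}  # session key -> {"user": ..., "expires": ...}


def start_session(user_id, now=None):
    """Create a session once the user's credentials have been verified."""
    now = time.time() if now is None else now
    key = secrets.token_urlsafe(32)  # unguessable random session key
    _sessions[key] = {"user": user_id, "expires": now + SESSION_TTL_SECONDS}
    return key


def lookup_session(key, now=None):
    """Return the stored session state, or None if unknown or expired."""
    now = time.time() if now is None else now
    state = _sessions.get(key)
    if state is None or state["expires"] <= now:
        _sessions.pop(key, None)  # drop expired entries
        return None
    return state
```

Every server that handles a request needs access to `_sessions` or an equivalent shared store, which matters for the scalability discussion later in this article.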
OAuth2 has become a popular way to authorize user access to protected resources. The OAuth2 authorization framework defines several authorization grant flows. Though most service providers follow the spirit of the specifications, they often choose to implement just a part of the full specifications or sometimes implement capabilities differently than specified.
The OAuth2 flow which most closely resembles basic access authentication is called resource owner password credentials grant. In this flow, the mobile client directly obtains the resource owner's id and password credentials and passes them to its back-end resource server. The back-end server validates the credentials, and returns an access token to the client.
The returned access token looks just like a session key, and it is used by the client in the same way; the access token is provided in any HTTP request requiring authorization to make an API call. As with basic access authentication, the access token stands in for the resource owner's credentials, so once the client has received an access token, the client should discard the credentials. The resource owner must trust that the client is not retaining these credentials.
Where an access token differs from a session key is in how the token is interpreted. JSON Web Tokens (JWT) are a secure, URL-safe method for representing claims and are often used as OAuth2 access tokens. The JWT.io site provides a convenient place to experiment with tokens.
A JWT contains a JSON formatted payload describing a set of claims. Common claims include:
"iss" - identifies who issued the token
"sub" - the principal subject of the claims, often the resource owner
"aud" - the intended audience for the claims, often the resource server
"exp" - the expiration timestamp of the claims
The access token is also called a bearer token and is passed with every API call, typically in the HTTP Authorization request header using the Bearer scheme.
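A minimal sketch of attaching a bearer token to a request with Python's standard library follows. The URL and token value are placeholders, and `build_api_request` is a hypothetical helper name.

```python
import urllib.request


def build_api_request(url, access_token):
    """Attach the access token as a bearer token in the Authorization header."""
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Bearer {access_token}")
    return request


# Placeholder values; the request is built here but not sent.
req = build_api_request("https://api.example.com/v1/items", "header.payload.signature")
```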
There are different ways that the token can be validated by the resource server. One common approach is to sign the JWT using a secret known to both the resource server and the authorizing service, which in this case is the resource server itself. An attacker cannot modify a token's claims without invalidating the signature. The client can read the claims if necessary, but it does not know the secret, so it cannot verify the token itself. Neither the user credentials nor the signing secret are stored on the client, so they cannot be extracted through reverse engineering the application.
The server can validate the signature and check that the claims have not expired. With basic access authentication, the session key is used to retrieve state stored in the back end. If there are multiple servers which can service a request, then accessing this state and synchronizing it between servers can be a performance bottleneck. With access tokens, authorization state, such as the subject, is stored on the client, and that state is provided with every call. If the client maintains and provides the equivalent of all session state with every call, then the API protocol is stateless, and any server is free to handle a request independently of other servers, greatly improving system scalability.
Since the signing secret is not stored on the client, the secret can be changed on the server without requiring changes to the client. Access tokens are valid until they expire, so if the expiration window is long, a stolen token could be used successfully by an attacker for quite a while. If suspicious behavior is detected, a token can be blacklisted on a group of servers. A blacklisted token will fail validation, triggering a new password credential sequence for the resource owner.
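One simple way to picture a blacklist is a shared set of token fingerprints that is consulted in addition to signature validation. The sketch below is an in-memory stand-in; `RevocationList` is a hypothetical name, and a real deployment would need a store shared by every server in the group.

```python
import hashlib


class RevocationList:
    """In-memory blacklist keyed by a SHA-256 fingerprint of the token."""

    def __init__(self):
        self._revoked = set()

    @staticmethod
    def _fingerprint(token: str) -> str:
        # Store a hash so the blacklist itself never holds usable tokens.
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def revoke(self, token: str) -> None:
        self._revoked.add(self._fingerprint(token))

    def is_revoked(self, token: str) -> bool:
        return self._fingerprint(token) in self._revoked
```

Entries only need to be kept until the token's own expiration time, after which normal validation rejects it anyway. The trade-off is that this check reintroduces a small amount of shared server-side state.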
Security can be enhanced if resource authorization is separated from resource access. If separated, then only the authorization server needs to handle user credentials. The user credentials are never exposed to the client or the resource server.
In OAuth2's implicit grant type, the authorization and resource servers are separated. The client sends the resource owner, through redirection, to the authorization server's website. The local user agent, usually a browser and separate from the client, submits the credentials. The authorization server validates the credentials and redirects the access token through the user agent and back to the client.
Along with the user credentials, the authorization server can receive a client identifier and a requested scope. The authorization server can use the resource owner's id, the client's id, and the requested scope to determine which resources the client is allowed to access and how the client can access or modify those resources. A single authorization server can therefore manage the security policies for many users across many clients and many resources.
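As an illustration of how the client identifier and requested scope reach the authorization server, the client typically redirects the user agent to a URL like the one built below. The endpoint, client id, redirect URI, and scope are hypothetical; the parameter names follow the OAuth2 specification, where `response_type=token` selects the implicit grant and `state` is a recommended anti-forgery value.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_authorization_url(authorize_endpoint, client_id, redirect_uri, scope, state):
    """Build the URL the user agent is sent to for an implicit grant."""
    query = urlencode({
        "response_type": "token",  # implicit grant: token returned via redirect
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    })
    return f"{authorize_endpoint}?{query}"


url = build_authorization_url(
    "https://auth.example.com/authorize",  # hypothetical endpoint
    "mobile-app",                          # hypothetical client id
    "com.example.app:/callback",           # hypothetical redirect URI
    "read:items",                          # hypothetical scope
    "xyz123",                              # should be random per request
)
params = parse_qs(urlsplit(url).query)
```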
Access tokens passed from client to resource server can be verified by the resource server using the same secret used to sign them. Both authorization and resource servers share this secret, but this secret is never exposed to the client or user agent. A separate system administrates user credentials, client ids, resource access scope, and shared secrets.
This grant type is considered implicit because the resource server implicitly trusts that the requester submitting the access token is the client. This can be a risky assumption. For example, if an attacker can compromise the user-agent, the attacker may be able to view the access token and subsequently use the token to make valid but malicious API calls.
The authorization code grant type adds client app authentication to the implicit flow.
Authorization is split into two steps. In the first step, the resource owner provides credentials through the user agent, but this time only an authorization code is returned to the client. The client calls back to the authorization server with the authorization code and an authentication secret. The authorization server then returns the access token directly to the client.
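The two steps can be sketched with a toy, in-memory authorization server. This is an illustration under simplifying assumptions: the class and method names are hypothetical, access tokens are opaque random strings rather than JWTs, and the resource owner's authentication is assumed to have succeeded before `issue_code` is called.

```python
import secrets


class AuthorizationServer:
    """Toy authorization server for the two-step authorization code flow."""

    def __init__(self, client_id, client_secret):
        self._clients = {client_id: client_secret}
        self._codes = {}   # authorization code -> (client id, user id)
        self._tokens = {}  # access token -> user id

    def issue_code(self, client_id, user_id):
        """Step 1: return a code via the user agent after user authentication."""
        code = secrets.token_urlsafe(16)
        self._codes[code] = (client_id, user_id)
        return code

    def exchange_code(self, code, client_id, client_secret):
        """Step 2: the client redeems the code, authenticating with its secret."""
        expected = self._clients.get(client_id)
        if expected is None or not secrets.compare_digest(expected, client_secret):
            return None
        entry = self._codes.pop(code, None)  # codes are single use
        if entry is None or entry[0] != client_id:
            return None
        access_token = secrets.token_urlsafe(32)
        self._tokens[access_token] = entry[1]
        return {"access_token": access_token, "token_type": "Bearer"}
```

A stolen code is useless without the client secret, and a redeemed code cannot be replayed.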
By separating the authorization process into two steps, the access token does not flow through the user agent, which is a big improvement in security. If the authorization code is exposed by the user agent, an attacker cannot make use of that code without also authenticating with the client secret.
As we know from Part 1, a static secret stored on the client is hard to hide from a determined attacker. We will discuss how to remove static secrets from the client shortly.
The OAuth2 spec does not require client app authentication beyond the authorization flow, but ideally both user access, through the access token, and app access, through the client secret, should be sent with each API call. The resource server should validate both before allowing access to resources.
One great thing about access tokens is that they have an expiration date. If they are somehow exposed, they are only useful for a limited amount of time. However, the resource owner has to enter credentials each time a new token is needed, so if lifetimes are short, users will get annoyed at having to repeatedly reauthenticate. Conversely, if lifetimes are long, an exposed token can do a lot of damage before it expires. A token suspected of being stolen can be revoked, but this will not undo any damage which occurred before detection.
With the authorization code grant type, OAuth2 optionally allows the use of refresh tokens. A refresh token can be received along with an access token during the initial authorization grant. Now an access token can be given a short expiration window, and when it expires, the refresh token can be sent to receive a fresh access token.
Refresh tokens have longer lifetimes than access tokens. The resource owner will not need to reauthenticate until the refresh token expires. If an access token is compromised, then its malicious use is limited to a short time. If a refresh token is compromised, it has a longer lifetime and can be used to generate additional access tokens. As such, refresh tokens are usually subject to strict storage requirements to ensure they are not leaked. They can also be blacklisted by the authorization server, which will trigger a new resource owner credentials session. I strongly recommend authenticating the client app during any refresh token operation, but this is not required by the OAuth2 spec.
With each refresh, in addition to the new access token, a new refresh token can also be sent. The old refresh token can be immediately blacklisted, or more complicated token rotation schemes can be used to frustrate the malicious use of any individual token.
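The simplest rotation scheme, where each refresh token is invalidated as soon as it is redeemed, might be sketched as follows. `RefreshTokenStore` is a hypothetical name, the store is in-memory only, and the tokens are opaque random strings for brevity.

```python
import secrets


class RefreshTokenStore:
    """Single-use refresh tokens: redeeming one invalidates it and issues a new one."""

    def __init__(self):
        self._active = {}  # refresh token -> user id

    def issue(self, user_id):
        token = secrets.token_urlsafe(32)
        self._active[token] = user_id
        return token

    def rotate(self, refresh_token):
        """Return (new_access_token, new_refresh_token), or None if not valid."""
        user_id = self._active.pop(refresh_token, None)  # old token dies here
        if user_id is None:
            return None
        new_access_token = secrets.token_urlsafe(32)
        return new_access_token, self.issue(user_id)
```

With this scheme, an attempt to reuse an already-redeemed refresh token fails, which can also serve as a signal that the token may have been stolen.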
The client secret should be used to authenticate the client app for:
initial access token grant
every access token refresh
every API call
Many people do not send the client secret with every API call, arguing that since the client was authenticated using the secret during the authorization code grant, it is redundant. I prefer to include it as one additional check that it is still the client who is using the access token.
Unfortunately, the client secret is statically stored in the client app, and as such, it is vulnerable. We can remove the static secret from the app by following a playbook similar to how OAuth removes the resource owner's credentials from the client, by delegating to an app authentication service. In this case, the client must present its app credentials to the authentication service, the service authenticates the app, and the client receives its own authentication token. Instead of authorizing user access to the resources, this token authorizes client access to the resources.
An app authentication service uses the unique characteristics of the client app to attest the app's integrity and authenticity. An example of a unique characteristic might be a simple hash of the application package. The integrity of this simple attestation depends on the integrity of the hashing computation. Such a simple scheme might be fairly easy to spoof.
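The simple package-hash attestation described above could look something like this sketch. The function names are hypothetical, and the package contents are stand-in bytes rather than a real application package.

```python
import hashlib
import hmac


def package_digest(package_bytes: bytes) -> str:
    """SHA-256 digest of the application package, as the app would compute it."""
    return hashlib.sha256(package_bytes).hexdigest()


def attest(reported_digest: str, known_good_digest: str) -> bool:
    """Service-side check of the digest reported by the app."""
    return hmac.compare_digest(reported_digest, known_good_digest)


# Stand-in bytes for the released package and a tampered copy.
released = b"original application package"
tampered = b"repackaged application package"
known_good = package_digest(released)
```

The weakness is visible in the sketch: the digest is computed on the device, so a tampered app can simply report the known-good value instead of hashing itself.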
A more robust attestation service might use a random set of high-coverage challenges to detect any app replacement, tampering, or signature replay. An example of this type of service is Approov (full disclosure, this is from my company, CriticalBlue). If the responses satisfy the challenges, the authentication service returns an authenticating, time-limited client integrity token signed with the client secret. The token can be verified by the resource server, which also knows the client secret. If the attestation fails, the service still returns a time-limited token, but it will fail verification by the resource server. The attestation service and the resource server share the client secret, but it is no longer stored in the client.
With every API call, both the client integrity token and the OAuth2 access token are sent in the request headers. The resource server is modified to validate both tokens before handling the request.
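A resource server check of both tokens might be sketched as below. This assumes both tokens are HS256-signed JWTs, that the integrity token travels in a header we have arbitrarily named `X-Client-Integrity`, and that only the signature and "exp" claim are checked. The `sign` helper stands in for the two token issuers so the example is self-contained.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign(claims: dict, secret: bytes) -> str:
    """Stand-in for the token issuers: build a compact HS256 token."""
    header = _b64url(b'{"alg":"HS256","typ":"JWT"}')
    body = _b64url(json.dumps(claims).encode("utf-8"))
    mac = hmac.new(secret, f"{header}.{body}".encode("ascii"), hashlib.sha256).digest()
    return f"{header}.{body}.{_b64url(mac)}"


def verify(token: str, secret: bytes, now: float):
    """Return the claims if the signature matches and "exp" is in the future."""
    try:
        header, body, signature = token.split(".")
        mac = hmac.new(secret, f"{header}.{body}".encode("ascii"), hashlib.sha256).digest()
        if not hmac.compare_digest(_b64url(mac), signature):
            return None
        claims = json.loads(base64.urlsafe_b64decode(body + "=" * (-len(body) % 4)))
    except (ValueError, TypeError):
        return None
    if not isinstance(claims, dict):
        return None
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        return None
    return claims


def authorize_request(headers: dict, user_secret: bytes, client_secret: bytes, now=None):
    """Return the user id only if both the user and app tokens validate."""
    now = time.time() if now is None else now
    auth = headers.get("Authorization", "")
    if not auth.startswith("Bearer "):
        return None
    user_claims = verify(auth[len("Bearer "):], user_secret, now)
    app_claims = verify(headers.get("X-Client-Integrity", ""), client_secret, now)
    if user_claims is None or app_claims is None:
        return None
    return user_claims.get("sub")
```

A token issued after a failed attestation would be signed with something other than the client secret, so it fails the second `verify` call and the request is rejected.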
Unlike user authentication, client authentication requires no user interaction, so client integrity token lifetimes can be extremely short, and no refresh tokens are needed.
Since the client secret is no longer stored in the app, if the client secret is somehow exposed, it can be replaced with a fresh secret without requiring any changes to the installed client base.
The initial authorization grant flow must be modified to add app authentication. Since the grant spec requires a static client secret, a mediation server is introduced. The client authenticates itself with the app authentication service and receives a client integrity token. As before, it receives an authorization code from the resource owner's authentication. It sends both to the mediation server. If the client integrity token is valid, the mediator sends the authorization code and the client secret to the authorization server, which returns the user access token. The mediator passes this token back to the client, which uses it for subsequent API calls until a new user access token is needed.
To refresh the user access token, the same mediation server can be used. This time if the client integrity token is valid, the refresh token is passed on to the authorization server and if valid, a fresh user access token is returned. Even though user access token refresh does not require app authentication, we were able to strengthen the refresh checks by adding the client integrity token validation to the mediator.
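The mediator's role in both flows can be sketched as two small functions. The function and parameter names are hypothetical, and the integrity check and the calls to the authorization server are passed in as callables so the sketch stays independent of any particular service.

```python
def mediate_code_exchange(integrity_token, authorization_code, *,
                          is_valid_integrity_token, exchange_code, client_secret):
    """Initial grant: forward the code plus client secret only for a genuine app."""
    if not is_valid_integrity_token(integrity_token):
        return None
    # The client secret lives on the mediator, never in the mobile app.
    return exchange_code(authorization_code, client_secret)


def mediate_refresh(integrity_token, refresh_token, *,
                    is_valid_integrity_token, refresh_access_token):
    """Refresh: forward the refresh token only for a genuine app."""
    if not is_valid_integrity_token(integrity_token):
        return None
    return refresh_access_token(refresh_token)
```

In both functions the authorization server is never contacted when the integrity check fails, so a tampered or impersonating app cannot obtain or renew a user access token through the mediator.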
Using this approach, we are able to protect every API call made to our resource server validating both user and app authenticity, and we did it all without exposing user credentials or the client secret to the client itself.
In Part 1, we demonstrated use of client secrets and basic user authentication to protect API usage. In Part 2, we introduced JWT tokens and described several OAuth2 user authentication schemes. On mobile devices, static secrets are problematic, so we replaced static secrets with dynamic client authentication, again using JWT tokens. Combining both user and app authentication services provides a robust defense against API abuse.
In Part 3, we will discuss a few remaining threat scenarios in the user authentication flow.