Even if you only have a vague interest in app security, I’m sure the recent furore around Clubhouse hasn’t escaped your attention. There is significant buzz around this invite-only iOS app. By enabling live audio chat rooms between participants, it sets the expectation that these interactions are somewhat private and certainly not recorded. With big celebrity names such as Elon Musk, Kanye West and Oprah Winfrey as users, there is significant demand for a coveted invite.
A few days ago the story broke that at least one user had found a way to stream the live audio feeds from multiple chat rooms onto the web so that anyone could hear them. This isn’t a “data breach” as such but a “data spillage”. In other words, the servers hadn’t been compromised; instead, somebody had found a way to impersonate the Clubhouse app so that they could gain access to the streams and relay them onto a web server for all to listen to. Put simply, this is yet another example of API scraping or abuse.
Of course that particular user’s account was soon blocked:
But others will follow. As @_DanielSinclair puts it:
Once again we see an app becoming a viral sensation and swiftly grabbing the attention of the reverse engineering community. Remember the Pokémon Go sensation of 2016 and the cat-and-mouse game that followed: the API was reverse engineered to allow cheating, the developer Niantic put further safeguards in place, and those were swiftly broken in turn. And yes, that particular game is still playing out.
The fault in the reasoning in these two cases is identical. The security boundary is not defined by what you can do in the app - the app doesn’t allow you to rebroadcast chat rooms to the web. The security boundary is really the API, and if it can be used by code other than the app then unintended things become possible.
This thread provides the story so far. The Clubhouse API had been reverse engineered from the mobile app. The code to access it is in the public domain and, since the app doesn’t even use pinning, its traffic is easy to observe using a MITM proxy. The documentation of the API is excellent, using an OpenAPI spec.
Clubhouse actually employs Chinese vendor Agora to do the audio streaming itself, as you can see in this architecture diagram. PubNub is used to handle real-time textual communication. The primary purpose of the Clubhouse API is to provide the user management and to issue the special tokens that are used to access Agora via its iOS SDK.
The Agora audio stream itself appears to be encrypted (although not end-to-end), so you can’t simply scrape the audio from the UDP-level Agora traffic.
If you can get access to the Agora tokens then it’s possible to join a chat room and stream the audio. The problem of course is that if you can do that outside of the imposed restrictions of the official mobile app, then you can relay the audio (or the Agora access token) anywhere you choose, such as a publicly accessible web site.
To get hold of an Agora token you need to log in via the Clubhouse API itself. But as the specs for this API are now reverse engineered and published, you don’t necessarily need to use the official app. You simply need to be an account holder and to provide your credentials to the right API endpoint. This gives you a user authorization token, and from there you can join a channel and get the Agora token.
But surely you should only be able to make this API call from the official mobile app? Well, not really. There appears to be some basic checking on the headers that are presented with the request. But these are easy to determine by looking at the traffic coming from the real mobile app, and once the right combination is found you are straight into the club:
This is really no barrier at all, and the reverse engineering community can quickly catch up should this magic entry requirement change.
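To see why a header check is no barrier, here is a minimal sketch of the idea. The header names and values are hypothetical placeholders, not the ones the real app sends, and the check is an assumption about how such a server-side test might look rather than a description of the actual Clubhouse backend.

```python
# Hypothetical static headers that a backend might expect from its app.
# These names and values are invented for illustration only.
EXPECTED_HEADERS = {
    "User-Agent": "officialapp/1.0 (iPhone; iOS 14.4)",
    "X-App-Build": "297",
    "X-App-Version": "1.0",
}


def naive_is_official_app(headers: dict) -> bool:
    """Server-side check that only compares static header values."""
    return all(
        headers.get(name) == value for name, value in EXPECTED_HEADERS.items()
    )


# Headers an observer copied from the real app's traffic via a MITM proxy.
captured_from_real_app = dict(EXPECTED_HEADERS)

# A script that is not the app simply replays them verbatim.
script_request_headers = {**captured_from_real_app, "Accept": "application/json"}

print(naive_is_official_app({"User-Agent": "python-requests/2.25"}))  # False
print(naive_is_official_app(script_request_headers))  # True
```

Anything the app sends that is static, and visible on the wire, can be copied. The check identifies the request format, not the software that produced it.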
There were some issues that allowed a single user to stream multiple chat rooms at the same time, and these have been fixed. But there is nothing to stop someone streaming a single room at a time. After all, that’s exactly what the official app is designed to do.
The key to improving the security posture is to not allow any sign-in requests from anything other than the official mobile app. Furthermore, those sign-in requests need to happen over a pinned HTTPS channel to make it extremely hard to intercept any of the traffic.
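As a rough illustration of what pinning means, the sketch below compares the fingerprint of the certificate presented on a connection against a set of fingerprints shipped with the client. The certificate bytes are placeholders, and hashing the whole certificate is a simplifying assumption; pins are often taken over the certificate's public key instead. A real mobile app would do this inside its platform networking stack rather than in Python.

```python
import base64
import hashlib


def fingerprint(cert_der: bytes) -> str:
    """Base64-encoded SHA-256 digest of a DER-encoded certificate."""
    return base64.b64encode(hashlib.sha256(cert_der).digest()).decode("ascii")


def pin_matches(cert_der: bytes, pinned: frozenset) -> bool:
    """True only if the presented certificate is one the client expects."""
    return fingerprint(cert_der) in pinned


# Placeholder bytes standing in for real DER certificates. In a Python client
# the real bytes could come from ssl.SSLSocket.getpeercert(binary_form=True).
genuine_cert = b"placeholder bytes for the API server certificate"
proxy_cert = b"placeholder bytes for a MITM proxy certificate"

# The pin set is fixed at build time and shipped inside the client.
PINNED = frozenset({fingerprint(genuine_cert)})

print(pin_matches(genuine_cert, PINNED))  # True
print(pin_matches(proxy_cert, PINNED))    # False
```

A MITM proxy has to present its own certificate in order to decrypt the traffic, so its fingerprint will not be in the pin set and the client can refuse to send anything.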
If only the official mobile app can log in and get an authorization token, then only the official mobile app can get the required Agora token to access the channels. And the official app doesn’t allow rebroadcasting.
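One common way to restrict sign-in to the genuine app is to require a short-lived, signed token on every request, issued only after the app has been attested. The sketch below shows the API-side check using an HMAC over a small payload. It is a simplified stand-in: the token layout, the secret and the function names are assumptions made for illustration, not Approov's or any other vendor's actual format.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Hypothetical secret shared by the attestation service and the API backend.
SHARED_SECRET = b"demo-secret-known-only-to-attester-and-api"


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii")


def issue_token(secret: bytes, lifetime_s: int = 300,
                now: Optional[float] = None) -> str:
    """What a hypothetical attestation service hands to a genuine app."""
    now = time.time() if now is None else now
    payload = json.dumps({"exp": int(now) + lifetime_s}).encode()
    sig = hmac.new(secret, payload, hashlib.sha256).digest()
    return _b64(payload) + "." + _b64(sig)


def verify_token(token: str, secret: bytes,
                 now: Optional[float] = None) -> bool:
    """API-side check run before any sign-in request is processed."""
    now = time.time() if now is None else now
    try:
        payload_b64, sig_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        sig = base64.urlsafe_b64decode(sig_b64)
        expected = hmac.new(secret, payload, hashlib.sha256).digest()
        if not hmac.compare_digest(sig, expected):
            return False  # not signed with the shared secret
        return json.loads(payload)["exp"] > now  # reject expired tokens
    except (ValueError, KeyError, TypeError):
        return False  # malformed token


good = issue_token(SHARED_SECRET)
forged = issue_token(b"attacker-guess")
expired = issue_token(SHARED_SECRET, now=time.time() - 3600)

print(verify_token(good, SHARED_SECRET))     # True
print(verify_token(forged, SHARED_SECRET))   # False
print(verify_token(expired, SHARED_SECRET))  # False
```

Unlike a static header, the token cannot be minted by a script because the secret never ships in the app, and a captured token stops working once it expires.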
Of course you can never stop somebody recording the analog audio from the official app and rebroadcasting it - although even then there are interesting things you can do with watermarking on the device to determine the leaking account.
You also need to check the environment that the mobile app is running in for complete protection. If the app is running on a jailbroken device, has been tampered with in some way, is being debugged or instrumented, or has had its pinning bypassed, then this must be detected and blocked.
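The decision itself can be sketched as a simple policy over the signals listed above. How each signal is gathered on the device is the hard part and is not shown here; the `DeviceReport` type and its field names are hypothetical.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class DeviceReport:
    """Hypothetical result of the on-device environment checks."""
    jailbroken: bool = False
    app_tampered: bool = False
    debugger_attached: bool = False
    pinning_bypassed: bool = False


def rejection_reasons(report: DeviceReport) -> list:
    """Names of every check that failed, in field order."""
    return [name for name, flagged in asdict(report).items() if flagged]


def should_allow(report: DeviceReport) -> bool:
    """Allow API access only when no check has failed."""
    return not rejection_reasons(report)


clean = DeviceReport()
risky = DeviceReport(jailbroken=True, pinning_bypassed=True)

print(should_allow(clean))       # True
print(should_allow(risky))       # False
print(rejection_reasons(risky))  # ['jailbroken', 'pinning_bypassed']
```

Making this decision away from the device, rather than inside the app, matters: a verdict computed in the app can be patched out by the same person who tampered with it.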
This is what Approov is - a professional bouncer for your API front door, ensuring you get to decide who, and what, gets entry to your club.