Nick Fishman


Reverse Engineering Native Apps by Intercepting Network Traffic

The ability to debug web applications is baked into every major browser — just click Inspect Element and you’ll see lots of information. It’s not quite as easy to do this with native apps, especially if you don’t have their source code. I’d like to show you how to understand the behavior of an application by inspecting its network requests (with or without SSL). I’ll also discuss some security implications relevant to developers who are building their own API (private or public).

Note: I am not a lawyer and this is not legal advice. However, using these techniques may violate a product’s TOS and may be illegal in some cases. Consult a lawyer if you have any doubts.

Setup

  1. Install Charles Proxy
  2. Configure your device to use the computer as a proxy
  3. Install the Charles root certificate on your device
  4. Start analyzing!

You can skip right to the Traffic Analysis section to see this in action.

1. Install Charles Proxy

Get it online at http://www.charlesproxy.com/. You can start out by using the trial version, but I highly recommend buying it. It’s a great use of $50 for all of its features. Besides, you’ll probably find it annoying that it closes after 30 minutes of use :-)

2. Configure your device to use the computer as a proxy

First, make sure your device and computer are connected to the same wireless network. If you’re using an iPhone or iPad: go to Settings -> Wi-Fi, then click the blue arrow on the far right of your network. Scroll down to HTTP Proxy and choose Manual. Enter your computer’s IP address and 8888 for the port.

image

If you’re using an Android device: go to Settings -> Wi-Fi, long press your network, then press Modify network. Press Show advanced options, set Proxy settings to Manual, and enter your computer’s IP address and port 8888 as above. If you don’t see that setting, make sure you have a recent version of Android. Most versions of Android before Ice Cream Sandwich (4.0) don’t let you configure the HTTP proxy, so you won’t be able to use this technique on them.

image

3. Install the Charles root certificate on your device

The above setup will let you intercept regular traffic, but you won’t be able to make sense of encrypted traffic. Many apps use HTTPS for their network communication, so this step will let you analyze them as well.

You’ll need to install the Charles root SSL CA certificate on your device. As of May 2013 this can be found at http://charlesproxy.com/charles.crt, but check the latest documentation (here and here) if that link doesn’t work. Put that URL into Safari or Chrome on your device, approve it, and you’re almost good to go.

SECURITY NOTE: Make sure to remove the Charles certificate when you’re done debugging, or your legitimate HTTPS traffic could be compromised later. See Final Notes below.

image

Behind the scenes, Charles will effectively perform a man-in-the-middle attack on your encrypted traffic. It presents its own certificate (signed by the Charles root CA) to your device in place of the server’s, and opens a separate HTTPS connection to the destination host, transparently decrypting and re-encrypting all data sent and received.

Inside Charles, go to Proxy -> Proxy Settings and click the SSL tab. Ensure that Enable SSL Proxying is checked. Click Add and type in * for the host field. This will allow Charles to intercept traffic sent to any host. For fine-grained debugging, you should replace the wildcard with only the hosts that you specifically care about.

image

If Charles asks you to automatically configure your computer’s Network Settings, go ahead and reject it. This post isn’t about debugging local applications on your computer (although this has many possibilities).

NOTE: Some apps may stop working or report connection errors if you try to inspect their traffic this way. This is likely due to additional security on their part (see SSL Pinning, below).

4. Start analyzing

Open up a browser on your device. Charles should display a prompt asking you to approve your device. Click Allow and you should start seeing data appear in the main window.

Traffic Analysis

For this walkthrough, I wrote a small iOS app that does simple authentication against a backend, sends the user’s current latitude and longitude, and displays a list of nearby places returned by the backend. However, for the sake of science let’s assume we’re analyzing an app that we downloaded from the App Store, and we know nothing about how it works. Let’s see how that looks inside Charles.

image

Clicking on the individual items shows the request and response bodies. Charles comes with some very nice functionality by default, such as JSON parsing, timing charts, and more.

image

image

image

We can now deduce quite a bit of information:

  • The app is communicating with a backend at sandbox.nickfishman.com
  • It’s using Flurry Analytics and Apple Push Notifications
  • It’s using a RESTful web API and supplying an access_token parameter to each request

Playing with the app further would let us enumerate most of the backend’s API endpoints and learn how they operate. We also have a valid access_token, so we can probably masquerade as a legitimate client and hit the API programmatically (even if the API is private and undocumented).
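As a rough illustration of hitting the API programmatically, here is a minimal Node.js sketch that builds the kind of request URL we saw in Charles. The host and the access_token parameter come from the capture above. The https scheme, the /places path, the lat and lng parameter names, and the token value are assumptions made up for this example, not details of the real backend.

```javascript
// Sketch only: the scheme, the path, and the lat/lng parameter names are hypothetical.
var API_HOST = 'https://sandbox.nickfishman.com';

// Build a request URL that carries the captured token plus a location.
function buildPlacesUrl(accessToken, lat, lng) {
  var url = new URL('/places', API_HOST);
  url.searchParams.set('access_token', accessToken);
  url.searchParams.set('lat', String(lat));
  url.searchParams.set('lng', String(lng));
  return url.toString();
}

console.log(buildPlacesUrl('TOKEN_FROM_CHARLES', 37.7749, -122.4194));
```

Any HTTP client can then request that URL. From the backend’s point of view, the request is indistinguishable from one sent by the app.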

Useful Charles features

Setting breakpoints and modifying requests in-place: One of the most powerful features of Charles is the ability to set breakpoints on certain URL patterns. When a request triggers such a breakpoint, Charles lets you modify the request headers and body in-place before sending them on to the server. This lets you instantly test how the backend reacts to changing parameters, without needing to build even a simple custom HTTP client. In our example, we could forge the latitude/longitude parameters sent by the client and see how the backend responds to different locations around the globe.

image

Replaying requests: We can replay any of the requests in the list to see how the backend reacts. Right-click on a request and click Repeat. We can also stress test a particular API endpoint by issuing many requests in parallel (Repeat Advanced).

Simulating latency: Charles also lets us place artificial limits on the bandwidth and latency of requests. This helps simulate slow network conditions, which sometimes cause apps to severely misbehave. For Mac users, you can do even more testing along these lines using Apple’s Network Link Conditioner.

Saving sessions: You can save an entire recording session for later reference or processing.

See the Charles documentation for details on these and other features.

Security implications and preventive measures

This example shows how easy it is to intercept and even actively modify a native app’s network traffic. Furthermore, it shows that even HTTPS APIs aren’t safe from this kind of meddling. How can we build secure web APIs in the face of such challenges?

Never trust the client

Perhaps the most important conclusion: never trust the client. It’s easy to masquerade as the client without disrupting any of the cryptography behind the authentication process. A client with a valid auth token can still misbehave — either intentionally or due to buggy code. We shouldn’t assume the client will always use the API as intended.

Well-known public APIs go to great lengths to handle abusive clients. Some preventive measures include rate limits, internal alerts for clients that supply invalid parameters or make strange API calls, and offline data mining to classify abusive or fraudulent behavior. You might consider implementing some of these techniques even if your API is private or undocumented. Check out nginx’s HttpLimitReqModule and Netfilter limits as starting points.
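To make the rate limit idea concrete, here is a minimal sketch of a fixed-window limiter written in Node.js. It is not how nginx or Netfilter implement their limits. The function name createRateLimiter, the choice of key (an access token here), and the injectable clock are all assumptions made for illustration.

```javascript
// Fixed-window rate limiter keyed by client (for example an access token or IP).
// Hypothetical helper, not part of any library.
function createRateLimiter(maxRequests, windowMs, now) {
  now = now || Date.now;     // injectable clock, handy for tests
  var windows = new Map();   // key -> {start, count}

  return function allow(key) {
    var t = now();
    var w = windows.get(key);
    // Start a fresh window if this client has none or the old one expired.
    if (!w || t - w.start >= windowMs) {
      w = {start: t, count: 0};
      windows.set(key, w);
    }
    w.count += 1;
    return w.count <= maxRequests;
  };
}

// Allow at most 2 requests per client per second.
var allowRequest = createRateLimiter(2, 1000);
console.log(allowRequest('token-a'), allowRequest('token-a'), allowRequest('token-a'));
```

The third call for the same key within one second is rejected. A real deployment would also need to evict old keys, since this sketch keeps every client it has ever seen in memory.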

One of the most unsettling things about this example is how easy it is to intercept an app’s secure communication data. Certainly, this requires the user to install a rogue root CA certificate, so we can’t just magically intercept random SSL traffic on a network without causing warnings to appear on people’s devices. But our threat model here doesn’t involve an attack on an unsuspecting user. Rather, it involves an attack on the app itself and its network API — an attack that requires physical device access. Engineers who assume that flipping on HTTPS safeguards app data from being tampered with are making a classic “security through obscurity” argument.

SSL Pinning

One of the most promising solutions to the rogue root CA certificate problem is SSL pinning, also known as certificate pinning. In this technique, a known copy of the server’s certificate is bundled with the app itself. When the app makes an HTTPS request, it validates the server’s provided certificate against its known copy. If the OS validates the certificate chain against a (potentially rogue) root CA but the certificate doesn’t match the app’s expectations, the app will reject the connection and display a network error.
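The comparison at the heart of pinning can be sketched in a few lines. This is only the matching step, written in Node.js to stay consistent with the other examples on this blog; a real iOS or Android app would do the equivalent inside its platform’s TLS callbacks. The function names and the stand-in certificate bytes are assumptions for illustration.

```javascript
var crypto = require('crypto');

// SHA-256 fingerprint (hex) of a DER-encoded certificate.
function fingerprint(derCert) {
  return crypto.createHash('sha256').update(derCert).digest('hex');
}

// True only if the presented certificate is byte-for-byte the pinned one.
function matchesPin(presentedDer, pinnedFingerprint) {
  return fingerprint(presentedDer) === pinnedFingerprint.toLowerCase();
}

// Stand-in bytes; a real app would bundle its server's actual certificate.
var bundledCert = Buffer.from('bytes of the real server certificate');
var pinned = fingerprint(bundledCert);

console.log(matchesPin(bundledCert, pinned));                               // true
console.log(matchesPin(Buffer.from('certificate minted by a proxy'), pinned)); // false
```

A certificate generated on the fly by an intercepting proxy has different bytes, so its fingerprint differs and the check fails even though the OS trusts the proxy’s root CA.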

The Chrome and Twitter iOS apps have already implemented this technique. For example, here’s what happens if I try to use Chrome through Charles:

image

You can learn more about how to implement SSL pinning in this blog post and in Twitter’s Security Best Practices guide. Keep in mind that SSL pinning can also be broken by jailbreaking or rooting the device and modifying the app binary (see the iSEC Partners presentation at Black Hat USA 2012). However, this attack is more complicated and requires a motivated attacker. SSL pinning is still a good approach for keeping your app’s API traffic hidden from casual observers.

Final notes

When you’re done debugging, it’s a good idea to remove the Charles root CA certificate from your device. Otherwise, there’s a chance that legitimate encrypted traffic from your device (such as usernames, passwords, Facebook session tokens, payment information, etc.) could be intercepted by someone else on your network with Charles. It’s unlikely, but don’t take that chance. On iOS, go to Settings -> General -> Profiles, and remove the Charles Proxy one. On Android, go to Settings -> Security -> Trusted credentials -> User, and remove the Charles one.

In this guide we used Charles Proxy because it’s super flexible and yet easy to use. A powerful, free, and open-source alternative is mitmproxy — it’s definitely worth checking out.

As you’ve hopefully seen, viewing native app network traffic is almost as easy as using Inspect Element in the browser. In general, you’re in trouble if the security of your app depends on nobody discovering the inner workings of your private API. On the other hand, you’re in much better shape if your app keeps all its confidential and valuable business logic in the backend, the backend uses a sensible authentication protocol, and the only real attack against your API is a DDoS.

Happy reverse engineering!

    • #security
    • #ssl
    • #reverse engineering
    • #sniffing
    • #charles proxy
    • #ios
    • #android
    • #ssl pinning
  • 5 days ago

Node.js HTTP requests with gzip/deflate compression

One of my recent projects involved scraping some web data for offline processing. I started using the excellent request library by Mikeal Rogers, which has a number of nice and convenient improvements over the default Node http library.

As I unleashed my first prototype on the web, the database started growing much faster than I had planned. I started by storing raw and uncompressed response data, so an immediate optimization was to use the Accept-Encoding HTTP request header to fetch compressed data from the server.

Unfortunately, some of my target servers sometimes sent back uncompressed data (which they’re entitled to do under the HTTP spec; it’s just slightly annoying). I needed a way to conditionally handle compressed data based on the Content-Encoding response header. I found a solution that worked with the default Node.js HTTP library, but it wasn’t immediately obvious how to port that to Mikeal’s request library.

Approach 1: no streams

My first solution collected data chunks into a Buffer, then passed that into the relevant zlib functions if needed. It’s more code than I wanted, but it works well.

Note: for simplicity, I’ve left out the logic that writes the compressed response body to the database.

https://gist.github.com/5499763
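In case the gist doesn’t load, here is a minimal sketch of the buffering idea using only Node’s built-in zlib module. It is a reconstruction under assumptions rather than a copy of the gist: the helper names decodeBody and collectAndDecode are made up for this example.

```javascript
var zlib = require('zlib');

// Hypothetical helper: decode a fully buffered body based on Content-Encoding.
function decodeBody(body, contentEncoding, callback) {
  var encoding = String(contentEncoding || '').trim().toLowerCase();
  if (encoding === 'gzip') {
    zlib.gunzip(body, callback);
  } else if (encoding === 'deflate') {
    zlib.inflate(body, callback);
  } else {
    // The server ignored Accept-Encoding and sent plain data.
    process.nextTick(function () { callback(null, body); });
  }
}

// Collect every chunk of a response stream into one Buffer, then decode it.
function collectAndDecode(res, callback) {
  var chunks = [];
  res.on('data', function (chunk) { chunks.push(chunk); });
  res.on('error', callback);
  res.on('end', function () {
    decodeBody(Buffer.concat(chunks), res.headers['content-encoding'], callback);
  });
}
```

With the request library, the same decodeBody helper should work on the body passed to the completion callback, as long as the request is made with the encoding: null option so that the body arrives as a Buffer instead of a string.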

Approach 2: streams

The downside to the first approach is that all response data is buffered in memory. This was fine for my use case, but in general this can cause memory issues if you’re scraping websites with really large response bodies.

A better approach is to use streams, as Mikeal suggested. Streams are a wonderful abstraction that can help you manage memory consumption better, among other things. There are two great introductions to Node streams here and here. Keep in mind that streams in Node.js are somewhat intricate and still evolving (for example, Node 0.10 introduced streams2, which is not entirely backwards compatible with older versions of Node).

Here’s a working solution that pipes response data into a zlib stream, then pipes that into a final destination (a file, in this case). Notice that the code is cleaner and more readable.

https://gist.github.com/5515364
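In case the gist doesn’t load, here is a minimal sketch of the streaming idea, again using only Node’s built-in modules. It is a reconstruction under assumptions, not a copy of the gist: the helper name decodedStream and the output file name are made up for this example.

```javascript
var zlib = require('zlib');

// Hypothetical helper: return a readable stream of the decoded body,
// picking a zlib stream based on the Content-Encoding response header.
function decodedStream(res) {
  var encoding = String(res.headers['content-encoding'] || '').trim().toLowerCase();
  if (encoding === 'gzip') {
    return res.pipe(zlib.createGunzip());
  }
  if (encoding === 'deflate') {
    return res.pipe(zlib.createInflate());
  }
  return res; // plain data, pass it through untouched
}

// Usage sketch (assumes `res` is an HTTP response stream):
//   decodedStream(res).pipe(require('fs').createWriteStream('body.html'));
```

Only one chunk at a time passes through memory, which is the whole point of this approach. Note that pipe does not forward errors, so production code would attach error handlers to each stream in the chain.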

Summary

Both of those approaches will get the job done with Mikeal’s library, and the one you choose depends on the use case. In my project, I needed to save the compressed response data as a field of a Mongoose document, then further process the decompressed data. Streams don’t suit this use case well, so I used the first approach.

    • #gzip
    • #deflate
    • #http
    • #tech
    • #zlib
    • #nodejs
  • 2 weeks ago

Inbox Zero, for several reasons:

  • I got tired of seeing someone clapping (ad infinitum) about having an empty inbox, and wanted to try it myself.
  • Is this real life?
  • It feels amazing.

Next up: Getting Things Done.

    • #inboxzero
    • #gtd
  • 11 months ago

When I get my inbox to zero, I’m like

runningastartup:

via Shirley

  • 1 year ago > runningastartup

I propose a revised version of the famous proverb: “The way to a man’s heart is through his startup.”

    • #ampcloud
  • 1 year ago

Coding in the cloud (almost)

  • 1 year ago

Food fun with a friend.

The best part: I exported these from Google+ and got prompted to tag the “face” in one of them. Can you guess which?

    • #misc
    • #foodfun
  • 1 year ago

Plink, a collaborative HTML5 music game

tulpinspiration:

Plink is a very original idea. It’s an online, collaborative, multiplayer music making toy made by Dinahmoe. It uses Node.js and WebSockets to create a multi-user “chatroom” but instead of entering text to have a chat, the interface generates music!

Pulsating circles are generated by moving the mouse over a <canvas> element. Clicking and holding the mouse generates a musical tone. The colour of the circle determines the type of audio you play such as high or low notes, and is created using Google’s Web Audio JavaScript API. What a great combo of web technologies, and great fun too!

Check out a video of it in action: http://vimeo.com/26271666

There are some cool things going on here. On the client, they’re using the Web Audio API (available in recent versions of Chrome and Safari) to dynamically play sounds, and WebSockets to make the experience live and interactive. They’re using Node.js on the server.

It’s worth noting that the sounds all come from a pentatonic scale. This is why the music miraculously doesn’t sound cluttered or discordant, even when lots of people are playing. You’ve probably experienced something similar if you’ve ever tried playing just the black keys on a piano: no matter what order you play them, you just can’t go wrong.

Each instrument is sampled across 16 tones. Try some of them to hear what I mean:

  • http://labs.dinahmoe.com/plink/sounds/bziaou_11.ogg
  • http://labs.dinahmoe.com/plink/sounds/bziaou_12.ogg
  • http://labs.dinahmoe.com/plink/sounds/bziaou_13.ogg
  • http://labs.dinahmoe.com/plink/sounds/bziaou_14.ogg
  • http://labs.dinahmoe.com/plink/sounds/bziaou_15.ogg
  • http://labs.dinahmoe.com/plink/sounds/bziaou_16.ogg
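To see what 16 tones on a pentatonic scale might look like as numbers, here is a small sketch that computes their frequencies. I don’t know which pentatonic scale or base pitch Plink actually uses, so the major pentatonic offsets and the 220 Hz starting note are assumptions chosen purely for illustration.

```javascript
// Semitone offsets of a major pentatonic scale within one octave.
var PENTATONIC_STEPS = [0, 2, 4, 7, 9];

// Frequencies of the first `count` pentatonic notes above baseHz,
// using equal temperament (each semitone multiplies by 2^(1/12)).
function pentatonicFrequencies(baseHz, count) {
  var freqs = [];
  for (var i = 0; i < count; i++) {
    var octave = Math.floor(i / PENTATONIC_STEPS.length);
    var semitones = 12 * octave + PENTATONIC_STEPS[i % PENTATONIC_STEPS.length];
    freqs.push(baseHz * Math.pow(2, semitones / 12));
  }
  return freqs;
}

console.log(pentatonicFrequencies(220, 16).map(function (f) { return f.toFixed(1); }));
```

Every sixth note lands exactly one octave above an earlier one, and the scale skips the semitone-apart neighbors that tend to clash, which is why random combinations still sound pleasant.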

On a tech note, the client code isn’t the cleanest thing in the world. It would be easier to read (and maintain) if it used Socket.IO instead of raw WebSockets, and if it used jQuery or some other JavaScript library to manipulate the DOM.

Still, a very innovative use of the Web Audio API and quite fun to play with.

    • #music
    • #nodejs
    • #socket.io
    • #tech
    • #websockets
    • #javascript
  • 1 year ago > tulpinspiration

Speeding up Mongoose queries by requesting only the fields you need

I’m currently building a startup (ampcloud) with Node.js, MongoDB, Mongoose, and a handful of other tools. After spending quite a few years in the Django world, it’s been fun doing a mental context switch into the land of JavaScript, callbacks, and closures. Occasionally I’ve run into some gotchas, and this particular one is a great example.

Let’s say you’re building a blog, and part of your database schema looks something like this:

var mongoose = require('mongoose');
var Schema = mongoose.Schema;

// Comments are embedded inside each post document (see PostSchema.comments).
var CommentSchema = new Schema({
  title: {type: String},
  body: {type: String},
  createdAt: {type: Date}
});

var PostSchema = new Schema({
  author: {type: String},
  title: {type: String},
  createdAt: {type: Date},
  slug: {type: String},
  comments: [CommentSchema]
});

module.exports.Post = mongoose.model('Post', PostSchema);

Every post is stored as a separate document in MongoDB, but all comments are embedded within it. This means that when you fetch a post, you’ll get all the comments back with it.

Now let’s say you want to display a list of the 20 most recent blog posts on your home page. Assuming you’re using Express, you would write a view like:

app.get('/', function(req, res) {
  Post
    .find()
    .desc('createdAt')  // newest first, so limit(20) keeps the 20 most recent
    .limit(20)
    .run(function(err, posts) {
      if (err) {
        res.render('error', {status: 500});
      } else {
        res.render('allposts', {posts: posts});
      }
    });
});

You’d also want to add an index to allow efficient querying by date created:

PostSchema.index({createdAt: 1});

Your blog will probably work well at first, but you’ll run into problems as soon as one of your amazing posts goes viral and gets thousands of comments. You’ll notice that your main page starts taking a lot longer to load. Even when you’re the only one browsing your blog, it just won’t feel as snappy anymore.

Beware: Mongoose fetches all fields by default

The culprit is the comments field. Because a Mongoose query requests all fields of a document by default, every site visitor will cause it to request and parse the entire list of comments. Every time. You don’t even need the list of comments to render the main page.

Let’s get rid of the comments field by adding the following line to the query chain:

    .exclude('comments')

The final result:

app.get('/', function(req, res) {
  Post
    .find()
    .desc('createdAt')  // newest first, so limit(20) keeps the 20 most recent
    .limit(20)
    .exclude('comments')  // skip the embedded comments entirely
    .run(function(err, posts) {
      if (err) {
        res.render('error', {status: 500});
      } else {
        res.render('allposts', {posts: posts});
      }
    });
});

You’ll find that this performs a lot better. The problem isn’t so much that MongoDB can’t return the data quickly enough. Rather, Node.js has to spend much of its time parsing the extra data (BSON, MongoDB’s binary JSON-like format) into JavaScript objects, which is both unnecessary and time-consuming.
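To get a feel for how much extra data the comments add, here is a small sketch that compares the serialized size of one post with and without its embedded comments. It doesn’t use Mongoose or MongoDB at all: plain objects and JSON stand in for the real documents, and the helper names and the 5000-comment figure are assumptions made for illustration.

```javascript
// Build a fake post document with many embedded comments.
function makePost(commentCount) {
  var comments = [];
  for (var i = 0; i < commentCount; i++) {
    comments.push({title: 'Comment ' + i, body: 'Nice post!', createdAt: new Date(0)});
  }
  return {author: 'nick', title: 'Hello', slug: 'hello', createdAt: new Date(0), comments: comments};
}

// Shallow copy of a document without the named field,
// which is roughly what excluding a field from the query gives you.
function withoutField(doc, field) {
  var copy = {};
  Object.keys(doc).forEach(function (key) {
    if (key !== field) { copy[key] = doc[key]; }
  });
  return copy;
}

var post = makePost(5000);
var fullSize = JSON.stringify(post).length;
var trimmedSize = JSON.stringify(withoutField(post, 'comments')).length;
console.log('with comments:', fullSize, 'characters; without:', trimmedSize);
```

Multiply that difference by 20 posts per page load and by every visitor, and the extra parsing work adds up quickly.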

Not surprisingly, I recently encountered this issue in production. I made the fix right at 3:00 GMT, and the load dropped dramatically.

Takeaway: think about your queries

When your models start accumulating lots of data, think about whether you can request a subset of fields when making queries. See the Mongoose query documentation for details.

Caveat: Keep in mind that you won’t gain much by excluding fields that store primitive types like Strings, Numbers, or Dates. Even worse, your code will probably get harder to read and maintain. Only make such optimizations when you have to.

Some final notes

The above schema suffers from a fundamental flaw: it doesn’t scale well. If a blog post gets thousands of comments, you’ll probably want to paginate the comments and only show several hundred at a time. But with this schema, you can’t ask MongoDB for a subset of comments. You can only get all or nothing.

To make this production ready, you’d probably want to separate Comment and Post into separate Mongoose models, instead of nesting Comments within Posts as embedded documents. Each Comment would be a separate MongoDB document, you’d store the Post id within the Comment, and you could efficiently query for arbitrary subsets of comments on a particular blog post (one page at a time, for example).

    • #mongodb
    • #mongoose
    • #nodejs
    • #tech
    • #ampcloud
  • 1 year ago

Some new beginnings

Greetings, fellow tumblrs and other readers.

I’ve created this space as a way to share interesting, sometimes relevant, occasionally whimsical, and hopefully useful thoughts about technology, entrepreneurship, people, music, and other important life matters.

Ideally, this would be part of a personal website that looks flashy and represents me perfectly. There’s a Russian proverb that my parents bring up at times like these:

“Лучшее — враг хорошего.”

“The best is the enemy of the good.”

—Voltaire

With that in mind, this will do for now.

  • 1 year ago

About

I'm a software engineer and entrepreneur. I like to solve high-impact problems with technology. I'm also the CTO and co-founder of sonicpanther.

Social

  • @nickfishman on Twitter
  • Google
  • Linkedin Profile
  • nickfishman on github

© 2013 Nick Fishman. All rights reserved.
