Defeating a dockerized API to get access to source code

A few weeks back, I got some interesting feedback on my article about finding exploitable vulnerabilities in APIs using code review and taint analysis. A couple of people pointed out that they never have access to the API source code.

I hear you. But is that always true? Sometimes, the source code is right in front of you… if you know where to look.

In this article, I will show you how to break into a docker container and dump everything you need to extract the source code from a target API you are responsible for testing.

Let’s get right to it!

Gaining access to a containerized API

You’d be surprised how often I have encountered interesting APIs during an appsec pentest that I have never heard of before. One of the first things I do when this happens is to figure out if I can get my hands on the API.

I’ll search to find the vendor and see if I can download and install the API bits. I’ll search public source code repos like GitHub. Google dork until I am blue in the face. And at some point, I may find myself on Docker Hub.

Why Docker Hub?

Docker Hub has a powerful search engine that lets you anonymously search across all the images it hosts for the packages delivering the API. Sometimes an image comes from the API publisher itself. Often, though, you will find the API packaged by someone else as part of a customized container.

And if you find it, you can usually pull it down directly to get what you need.

Let’s use our favorite Completely Ridiculous API (crAPI) published by OWASP. Even though they publish the source code to their API on GitHub, I will show you how to recover it directly from the published container hosted on Docker Hub.

Here we go!

Searching with Docker Hub

Head over to Docker Hub and do a simple search for crapi. You can immediately see several container images of interest. In fact, the first result, crapi/crapi-identity, is an ideal image to interrogate.

Now that we found the image we want, we can pull it down using the docker CLI.

(I am of course assuming you already have docker installed on your local system)

$ docker pull crapi/crapi-identity

Now let’s check to see if you have the images downloaded locally:

$ docker images

In my case, I have all the images locally relating to crAPI as I used their docker-compose.yml file to download everything pertaining to their web app and API.

The point is you only need the crapi-identity image for this exercise. As long as you see it listed in the output of your images query, you are ready to proceed.

Dumping API artifacts and decompiling to source

Here is an interesting tidbit of information for you. Did you know you can gain access to the contents of a docker image without ever running it? Yep. As long as you create a container based on the image, you can read its filesystem while the container stays stopped.

Let’s do that.

Create your container from the target image

$ docker create crapi/crapi-identity

Copy the container id returned; you’ll need that in the next step.

Copy all files out of the container

Now that you have created a container based on the image downloaded, we want to copy all the files out of it into a temporary working directory. Some people might argue it is better to start the container, get an interactive shell, and then browse to find the correct files and only extract the smallest file set.

I disagree.

You want to get an exact copy of all the files before the container ever runs and get it out of there so you have it permanently. This also prevents startup code from running that could manipulate the files or call out to a licensing server or whatnot that could leak information about you and/or your docker install.

Let me give you a couple of real-world examples (aka war wounds I have) of why this approach is essential.

  1. I once stumbled upon a hardcoded API key in a default image. On startup, the container replaced that value with one set through environment variables. The default key was valid and gave complete admin access to a critical backend API. Had I waited until the container spun up, I would have lost that information.
  2. I once triggered an alarm when I ran a container without realizing it had a call-home feature that immediately alerted the SOC to my presence. They were forewarned and were monitoring my VPN endpoint. They quickly locked me out of the backend APIs I was trying to test because I had not configured the container cluster correctly, and it reported that back to them. While it was easy enough to dispose of that IP and get another one when it was time to attack the API, it was unnecessary additional work that I could have avoided if I had just been more careful.

Hopefully, I’ve now convinced you WHY you need to copy all the files out of the container. Let’s do that.

Pick a base directory to work out of. I will choose ~/crapi/poc/. YMMV.

Create a temp directory in your working directory and copy all the files there using the docker CLI.

$ mkdir tmp
$ docker cp <containerid>:/ tmp
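
If you would rather script the create and copy steps together, here is a minimal sketch. The dump_image helper is a hypothetical name used for illustration, not a docker feature; it only wraps the docker create and docker cp commands shown above and assumes docker is on your PATH.

```shell
# dump_image IMAGE DEST
# Creates (but never starts) a container from IMAGE, copies its whole
# filesystem into DEST, and prints the container id on stdout.
# Hypothetical helper name; it wraps the two docker commands from the article.
dump_image() (
    image="$1"
    dest="$2"
    mkdir -p "$dest" || exit 1
    # docker create prints the new container id; nothing runs inside it.
    cid="$(docker create "$image")" || exit 1
    # Copy everything from the container's root into the destination directory.
    docker cp "$cid:/" "$dest" >&2 || exit 1
    printf '%s\n' "$cid"
)

# Example (needs a working docker install):
#   cid="$(dump_image crapi/crapi-identity tmp)"
```

Keeping the container id in a variable saves you from copying and pasting it into the later steps.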

Now that you have all the files, it’s time to figure out how the container starts up and where the API artifacts may reside.
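
One rough way to get a first idea of where the artifacts live is to search the extracted tree by file extension. This is only a sketch: find_artifacts is a hypothetical helper, and the extension list is an assumption you should adjust to the stack you expect.

```shell
# find_artifacts DIR
# Lists files under DIR whose extensions commonly hold API code.
# The extension list is an illustrative guess, not an exhaustive one.
find_artifacts() {
    find "$1" -type f \( \
        -name '*.jar' -o -name '*.war' \
        -o -name '*.dll' -o -name '*.php' \
    \) 2>/dev/null | sort
}

# Example:
#   find_artifacts tmp
```

Expect noise, since a runtime shipped in the image may bring its own archives. The startup command covered next is the more reliable pointer.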

Extract container startup command

When a container is created, the quoted command shown in the COMMAND column of docker ps is the first command that will be executed on startup.

Chances are, you won’t see anything returned because the docker ps command only displays the currently running containers by default. However, with some creative arguments, you can easily extract how the container will start up:

$ docker ps -a --no-trunc -f "id=<containerid>" --format "{{.Command}}"

Let me run you through the arguments:

  • -a : Show all containers (not just running ones)
  • --no-trunc : Do not truncate any of the output. This is important so you can see the complete Command column.
  • -f "id=<containerid>" : Run a filter. We are filtering on the container id we got when we created it.
  • --format "{{.Command}}" : Format the output to only show the Command column data

The final output is the full, untruncated startup command for the container.

And there we have it. We can immediately determine that the crapi-identity image is built to run a Java microservice, and the API artifacts are SOMEWHERE in /app/user-microservices-1.0-SNAPSHOT.jar.
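
As an alternative sketch, docker inspect can print the entrypoint and command straight from the container configuration, and a small grep can pull out anything that looks like a JAR path. Both function names are hypothetical, and the command string in the usage comment is a made-up illustration, not the actual crAPI output.

```shell
# startup_command CONTAINER_ID
# Prints the entrypoint and command stored in the container's configuration.
startup_command() {
    docker inspect --format '{{json .Config.Entrypoint}} {{json .Config.Cmd}}' "$1"
}

# jar_paths COMMAND_STRING
# Extracts anything that looks like an absolute path to a .jar file.
jar_paths() {
    printf '%s\n' "$1" | grep -oE '/[^[:space:]"]+\.jar'
}

# Example with an illustrative command string:
#   jar_paths 'java -jar /app/user-microservices-1.0-SNAPSHOT.jar'
# Example against a created container:
#   jar_paths "$(startup_command <containerid>)"
```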

Now, if this container ran something like NodeJS or PHP, we would have direct access to the source code files. In this case though, we will need to crack open the JAR file and decompile all the byte-code class files.

Decompiling the Java app

If you are new to Java, it is both a cross-platform programming language and a computing platform. You take source code (files with a .java extension) and compile it into byte-code (files with a .class extension), which is archived together into .jar files. The Java Virtual Machine (JVM) acts as the runtime engine to execute the byte-code. The JVM is part of the Java Runtime Environment (JRE) that is typically installed in the container.

With me so far?

So we need to reverse that process here. We need to take the user-microservices-1.0-SNAPSHOT.jar file, break it down into its class files, and then decompile them back to Java source code.

There are tools that can help us. One of the most popular ones is JD-GUI. It’s a graphical user interface that lets you browse JAR files, decompile them into source code files, and export all the source code into a zip file.

However, when working quickly at the command line, I prefer to use jd-cli. It breaks me from the shackles of a GUI tool that requires human intervention with mouse clicks and allows me to automate the entire decompiling process.

Although I won’t show it here, I built my own tooling that can pull down any new docker image belonging to my targets, extract all its byte-code, decompile it back into source code (for both Java and .NET Core), automatically run a rough source code audit with graudit, and report back to me each morning whenever something new and interesting is found.

I’ve found several vulns this way. YMMV of course.

Anyways, back to our target here. Knowing where the JAR is, we can create a src folder in our working directory and let jd-cli do all the work:

$ mkdir src
$ jd-cli tmp/app/user-microservices-1.0-SNAPSHOT.jar -od src -g ALL

When the command finishes, we can move into the src directory and immediately see several directories. The one we are interested in is the BOOT-INF folder. If we start to dig into the file structure, we can end up in BOOT-INF/classes/com/crapi… the core src for the identity microservice.
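
BOOT-INF is the layout Spring Boot uses for executable JARs: the application's own classes sit under BOOT-INF/classes, and its dependencies sit under BOOT-INF/lib. A quick sanity check that the decompile produced something is to count the .java files. This is a sketch with a hypothetical helper name.

```shell
# count_java_files DIR
# Prints how many decompiled .java files exist under DIR.
count_java_files() {
    find "$1" -type f -name '*.java' | wc -l | tr -d ' '
}

# Example:
#   count_java_files src/BOOT-INF/classes
```

A count of zero usually means the decompile failed or the JAR uses a different layout.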

Begin your code review and taint analysis

To finish out this methodology and make use of the source code, let’s do a quick source code audit looking for any dangerous functions we can exploit.

$ cd BOOT-INF/classes/com/crapi
$ graudit -d java -L . | grep exec

Bingo. We immediately find a potential command injection vulnerability in the identity server API.

How do I know this? You should go read my article on tracing API exploitability through taint analysis and find out. 🙃
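
If graudit is not installed, a plain grep for common Java command execution sinks gives a rough first pass. This is a sketch and not a replacement for graudit's signature database: the helper name is hypothetical, and the two patterns are an assumption about what is worth flagging.

```shell
# find_exec_sinks DIR
# Prints file:line:text for lines in .java files under DIR that call
# Runtime.getRuntime().exec or construct a ProcessBuilder.
find_exec_sinks() {
    # /dev/null is passed as an extra file so grep always prints file names.
    find "$1" -type f -name '*.java' \
        -exec grep -nE 'Runtime\.getRuntime\(\)\.exec|new ProcessBuilder' /dev/null {} +
}

# Example:
#   find_exec_sinks src/BOOT-INF/classes
```

Every hit is only a lead. You still have to trace whether attacker-controlled input can reach the call.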

Cleaning up after yourself

There is one final step after you have extracted the API artifacts and decompiled everything back to source code. You need to remove the container you created. You never want to run it later accidentally.

$ docker rm <containerid>
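
To confirm the container is really gone, you can ask docker ps for it again after removing it. A minimal sketch, with a hypothetical helper name:

```shell
# remove_container CONTAINER_ID
# Removes the container, then succeeds only if docker no longer lists it.
remove_container() {
    docker rm "$1" >/dev/null || return 1
    # With -q, docker ps prints only ids; an empty result means it is gone.
    [ -z "$(docker ps -a -q -f "id=$1")" ]
}

# Example:
#   remove_container <containerid> && echo "container removed"
```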


So there you have it. A process for defeating a dockerized API to get access to the source code. This can be extremely useful in cases where you need to do a more in-depth analysis of the code to find and exploit vulnerabilities in the APIs you are responsible for testing. Of course, your mileage may vary, but this is certainly a process I regularly use in my work when I do not have access to source code.

It might very well work for you too!


Dana Epp