September 26, 2023

Finding Hidden API Endpoints Using Path Prediction

Introduction

Recently, I was talking in a Slack channel on the OWASP workspace about contextual discovery of hidden API endpoints. It reminded me that newer API hackers may not understand how important it is to think like a developer.

In this article, I’ll explain contextual discovery and show you how to enhance API security testing by identifying hidden endpoints through predictable paths and routes.

Let’s get to it!

The Concept of Hidden API Endpoints

Hidden API endpoints can be best understood as the less visible entry points within an application’s API.

They are typically not documented or publicly announced for various reasons, such as being under development, serving internal functions, or possibly because they handle sensitive operations.

Developers may intentionally keep these endpoints obscured to keep them off the radar of potential attackers. However, their existence can still be inferred through methods like contextual discovery and path prediction. While they may not be publicly known, these hidden API endpoints can possess vulnerabilities, and hence, understanding and locating them is a critical aspect of bolstering API security.

TIP: Test with both privileged and unprivileged users

When testing API security, prioritize obtaining both privileged (highest admin) and standard user access before diving into endpoint recon.

Here’s why.

Modern web apps, especially SPAs, often lazy-load client-side code to keep the initial download small. This means some code won’t be downloaded until it is used, so you could miss endpoints you have no visibility into. Testing the app in both privileged and unprivileged contexts improves your visibility of API endpoints.

The same can be said for paid feature add-ons. If they exist, it’s likely their code won’t download until the feature is activated in the tenant. This is a key area for finding security vulnerabilities, because that code is less exposed and less tested by security researchers who haven’t paid for a license.
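To make the privileged versus unprivileged comparison concrete, here is a minimal sketch. It assumes you have already exported the API paths seen in each session (for example, from your proxy history) into two lists; the function name and sample paths are made up for illustration.

```python
def only_visible_when_privileged(admin_paths, user_paths):
    """Return paths observed in the privileged session but never in the standard one."""
    return sorted(set(admin_paths) - set(user_paths))


# Hypothetical paths captured while browsing as each kind of user.
admin_session = ["/api/users", "/api/admin/users"]
user_session = ["/api/users"]

print(only_visible_when_privileged(admin_session, user_session))
# ['/api/admin/users']
```

Anything in that difference is worth replaying with the standard user’s credentials to see whether access control is actually enforced.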

Introduction to Path Prediction

Path prediction is a technique that centers around understanding and anticipating the conventions and patterns used in API design.

Developers often follow a logical and consistent structure when creating these paths (sometimes called routes), making them somewhat predictable. This predictability is utilized in path prediction to guess potential hidden endpoints based on the known API structure.

It involves studying the URL structure, endpoint patterns, and other discernible indicators within the API documentation or code. By utilizing path prediction, security researchers can uncover obscured endpoints that may not be immediately apparent, thereby contributing significantly to API security testing.

What is a Path?

It’s important that you understand what a path is. Daniel Miessler has done a great job of explaining this in his article, The Real Differences Between a URL and a URI, which includes a couple of very useful diagrams that break a URL down visually.

In short, the path is the part of a URL that comes after the host (and optional port) and before any query string or fragment. In https://example.com/api/users?page=2, for instance, the path is /api/users.

Path Prediction in an API Context

Path prediction, in the context of APIs, involves making educated assumptions about an API’s hidden or non-public endpoints based on the known structure.

For instance, if you notice that the API uses a predictable naming convention, like /api/users/get or /api/users/update, you might predict other probable paths like /api/users/delete or /api/users/add.
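Here is a rough sketch of that idea in Python. The action wordlist and function name are my own assumptions for illustration; swap in whatever verbs you actually observe in the target API.

```python
def predict_action_paths(observed, actions):
    """Suggest unseen <base>/<action> paths from the ones already observed."""
    seen = set(observed)
    # Only treat a path as "<base>/<action>" if its last segment is a known action.
    bases = set()
    for path in observed:
        base, _, last = path.rpartition("/")
        if last in actions:
            bases.add(base)
    candidates = set()
    for base in bases:
        for action in actions:
            candidate = f"{base}/{action}"
            if candidate not in seen:
                candidates.add(candidate)
    return sorted(candidates)


# Hypothetical action wordlist; extend it with verbs the target really uses.
ACTIONS = ["get", "add", "update", "delete"]

observed = ["/api/users/get", "/api/users/update"]
print(predict_action_paths(observed, ACTIONS))
# ['/api/users/add', '/api/users/delete']
```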

Of course, developers may use HTTP methods (verbs) instead of separate paths to represent actions. So a POST to /api/users may create a new user, whereas a GET would retrieve all users. If this structure is observed, it’s worth investigating whether a DELETE would delete a user or a PUT would update one.
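A minimal sketch of how you might track which method and path combinations you haven’t tried yet. The method list is an assumption on my part (I’ve added PATCH, which many APIs use for partial updates), and nothing here sends any traffic; it only builds the list of requests to try.

```python
# Methods commonly mapped to create/read/update/delete in REST-style APIs.
CRUD_METHODS = ["GET", "POST", "PUT", "PATCH", "DELETE"]


def untested_methods(observed):
    """observed: iterable of (method, path) pairs. Returns the pairs not yet seen."""
    seen = {(method.upper(), path) for method, path in observed}
    paths = sorted({path for _, path in seen})
    return [(m, p) for p in paths for m in CRUD_METHODS if (m, p) not in seen]


observed = [("GET", "/api/users"), ("POST", "/api/users")]
for method, path in untested_methods(observed):
    print(method, path)
# PUT /api/users
# PATCH /api/users
# DELETE /api/users
```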

This method of prediction based on API structure is particularly useful in penetration testing or security auditing, where the goal is to discover undisclosed or weakly secured endpoints that could be exploited in an attack.

It’s worth noting that while path prediction can be a powerful technique, its effectiveness largely depends on the design consistency and predictability of the API paths, and it should be part of a broader security strategy, not a standalone solution.

The Developer’s Psychology of API Design: REST Resource Naming Conventions

Modern APIs are usually designed around the idea of API first – the concept of designing and building an API before a frontend or backend is even created.

This process puts the emphasis on the model, which will ultimately contain the data that the API will expose.

The way these models are represented in a RESTful context should follow certain conventions, which help to ensure they remain consistent and predictable for any API user.

In a typical REST API, data is primarily represented by a resource. A resource can be a singleton or a collection.

For example, “users” is a collection resource, whereas “user” would be a singleton resource. However, a singleton could also be extracted through an extended path attached to the collection.

For example, we can identify the “users” collection resource using the URI /api/users, and a single “user” resource using the URI /api/users/{userId}.

The point is to carefully monitor the paths we do see in an API, as they can give us leading indicators of the structure of other endpoints.
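One way to make those patterns easier to spot is to collapse observed paths into templates. This is a minimal sketch that assumes IDs are either integers or UUIDs; real APIs may use slugs, hashes, or other formats that this won’t catch.

```python
import re

# Matches a canonical 8-4-4-4-12 hexadecimal UUID.
UUID_RE = re.compile(r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$")


def templatize(path):
    """Replace ID-looking segments (integers, UUIDs) with an {id} placeholder."""
    segments = []
    for segment in path.split("/"):
        if segment.isdigit() or UUID_RE.match(segment):
            segments.append("{id}")
        else:
            segments.append(segment)
    return "/".join(segments)


# Hypothetical paths pulled from proxy history.
observed = ["/api/users", "/api/users/42", "/api/users/1337"]
print(sorted({templatize(p) for p in observed}))
# ['/api/users', '/api/users/{id}']
```

Once the collection and singleton templates are visible, it’s easier to ask which other collections might follow the same shape.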

API Design Best Practice Patterns You Can Exploit

Most API architecture and design docs describe best practices developers should follow when naming their endpoints. Those same practices lead to predictable paths, which you can take advantage of. Here are just a few that can help with path prediction.

Use a base URI path that is consistent and separate

You will typically find that an API will have a base path of something like /api or /api/v1, unless the API itself is on its own subdomain (e.g. api.domain.com). This helps to manage and version APIs separately from the main application and allows for complex routing. In microservice or serverless APIs, this permits API gateways to route resources to entirely different servers or containers handling the requests.

Be careful when tracking versioning. Sometimes it’s not managed by path but by parameter or header. Here are a few examples:

Path: /api/v1/users
Query Param: /api/users?v=1 or /api/users?api-version=1
Extended Header: /api/users with an extended header that starts with “x-” or “vnd-”, like x-api-header=1 or vnd-api-header=1
Mime Type: /api/users with an Accept header of “application/vnd.mycompany.myapp.users-v1+json”

When considering path prediction, the key is to follow the patterns and check for resources using the expected versioning. It is not uncommon to see different resources using different versioning. When this occurs, it’s a leading indicator that there may be different developers working on it or different languages/frameworks in use.
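The versioning styles listed above can be turned into request variants for a given resource. This is only a sketch: the header name and vendor MIME type are the placeholder examples from the list, not values any real API is guaranteed to use, so substitute what you observe in the target.

```python
def version_variants(resource, version, base="/api"):
    """Return (path, headers) pairs covering the versioning styles listed above."""
    plain = f"{base}/{resource}"
    return [
        # Versioned by path
        (f"{base}/v{version}/{resource}", {}),
        # Versioned by query parameter
        (f"{plain}?v={version}", {}),
        (f"{plain}?api-version={version}", {}),
        # Versioned by extended header (placeholder header name)
        (plain, {"x-api-header": str(version)}),
        # Versioned by MIME type (placeholder vendor type)
        (plain, {"Accept": f"application/vnd.mycompany.myapp.{resource}-v{version}+json"}),
    ]


for path, headers in version_variants("users", 1):
    print(path, headers)
```

Running the same resource through every variant, and then bumping the version number up and down, is a quick way to spot resources that are versioned differently from the rest of the API.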

Use nouns to represent resources

RESTful URIs should refer to a resource that is a thing (noun) instead of referring to an action (verb) because nouns have properties that verbs do not have – similarly, resources have attributes.

When considering path prediction, finding admin endpoints may change the context of access for such resources. As an example, if you know that /api/users allows access for end users to manage their resources, what does /api/admin/users do?

Here are some other possible resources, which I have seen in real-world APIs, to consider when generating a wordlist for fuzzing predictable admin paths:

administration
adminpanel
backend
config
console
controlpanel
dashboard
maintenance
management
master
operations
privileged
root
superuser
system
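A small sketch of turning that wordlist into candidate paths. It assumes the admin segment sits directly after the base path, as in the /api/admin/users example; other placements are possible, and the function name is just for illustration.

```python
# A subset of the wordlist above, plus "admin" itself.
ADMIN_WORDS = ["admin", "administration", "management", "system"]


def admin_candidates(path, words, base="/api"):
    """Insert each word right after the base path: /api/users -> /api/admin/users."""
    if not path.startswith(base + "/"):
        return []
    rest = path[len(base):]  # e.g. "/users"
    return [f"{base}/{word}{rest}" for word in words]


for candidate in admin_candidates("/api/users", ADMIN_WORDS):
    print(candidate)
# /api/admin/users
# /api/administration/users
# /api/management/users
# /api/system/users
```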

Use forward slash (/) to indicate hierarchical relationships

The forward-slash (/) character is used in the path portion of the URI to indicate a hierarchical relationship between resources. e.g.

/api/users
/api/users/{id}
/api/users/{id}/profile

During recon, make sure you map all hierarchical relationships to understand where data can be accessed. Sometimes the same resources can be accessed using different URIs, which may have different access controls or business logic.
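If you want a quick way to map those relationships, here is a minimal sketch that nests the paths you have observed into a tree. The helper names are my own; the paths are the ones from the example above.

```python
def build_path_tree(paths):
    """Nest path segments into dicts so parent/child relationships are visible."""
    tree = {}
    for path in paths:
        node = tree
        for segment in path.strip("/").split("/"):
            node = node.setdefault(segment, {})
    return tree


def render(tree, depth=0):
    """Return indented lines, one per segment, with children sorted by name."""
    lines = []
    for name in sorted(tree):
        lines.append("  " * depth + "/" + name)
        lines.extend(render(tree[name], depth + 1))
    return lines


paths = ["/api/users", "/api/users/{id}", "/api/users/{id}/profile"]
print("\n".join(render(build_path_tree(paths))))
# /api
#   /users
#     /{id}
#       /profile
```

Seeing the tree laid out makes it easier to notice when the same resource shows up under more than one parent.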

Use hyphens (-) to improve the readability of URIs

To make URIs easy for people to scan and interpret, developers are told to use the hyphen (-) character to improve the readability of names in long-path segments. e.g.

/api/users/{id}/registereddevices
/api/users/{id}/registered-devices /* This is much easier to read */

If you ever find objects in a JSON response containing complex names, look to see if they exist in a hierarchical relationship and use hyphens to break it up.

Let me clear that up.

Consider this JSON response when calling GET /api/users/{id}:

{
  "username": "SilverStr",
  "fname": "Dana",
  "lname": "Epp",
  "registeredDevices": [
    {
      "id": "2CCA1777-C0CE-47FD-A33A-C9CD3A047A83",
      "type": "iPhone"
    },
    {
      "id": "C9FED979-CD33-4DBB-A8D7-96D5EFBD90C9",
      "type": "Desktop"
    }
  ]
}

Notice how there is an object array for registeredDevices? Thinking about hierarchical path prediction, consider fuzzing for:

/api/users/{id}/registereddevices
/api/users/{id}/registered_devices
/api/users/{id}/registered-devices
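Generating those candidates can be automated. The sketch below assumes camelCase keys like the one in the response above and only looks at top-level keys that hold an array or object; the function names are hypothetical.

```python
import json
import re


def separator_variants(key):
    """registeredDevices -> registered-devices, registered_devices, registereddevices."""
    # Split at each lowercase-or-digit to uppercase boundary, then lowercase everything.
    words = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", key).lower().split()
    variants = ["-".join(words), "_".join(words), "".join(words)]
    return list(dict.fromkeys(variants))  # de-duplicate, keep order


def nested_path_candidates(base, response_json):
    """Suggest child paths for every top-level key holding a list or object."""
    body = json.loads(response_json)
    candidates = []
    for key, value in body.items():
        if isinstance(value, (list, dict)):
            candidates.extend(f"{base}/{name}" for name in separator_variants(key))
    return candidates


sample = """{
  "username": "SilverStr",
  "registeredDevices": [
    {"id": "2CCA1777-C0CE-47FD-A33A-C9CD3A047A83", "type": "iPhone"}
  ]
}"""

for candidate in nested_path_candidates("/api/users/{id}", sample):
    print(candidate)
# /api/users/{id}/registered-devices
# /api/users/{id}/registered_devices
# /api/users/{id}/registereddevices
```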

Do not use underscores ( _ )

An underscore can be used as a separator in place of a hyphen. However, depending on the application’s font, the underscore (_) character can be partially obscured or completely hidden in some browsers or screens.

To avoid this confusion, best practices tell developers to use hyphens (-) instead of underscores ( _ ).

/api/users/{id}/registered_devices
/api/users/{id}/registered-devices /* This is much easier to read */

But wait… didn’t I earlier tell you to fuzz for underscores too?

Yep.

Avoiding them may be a best practice, but in APIs written in languages like Python and Go, it’s common for routes to go either way.

Always test hyphens first, but don’t ignore the underscore during fuzzing.

Conclusion

Finding hidden API endpoints is an indispensable skill. It opens you up to a new attack surface on the APIs you are testing and exposes you to potentially more critical security vulnerabilities.

By understanding the principles and practices of API design that developers may follow, you can understand their psychology and predict the paths to hidden resources within the API under test.

Combine this with a strong foundation in how HTTP and REST work, and you can start to discover how endpoints are built contextually, how data flows, and what permissions are required to access them.

Then, it’s just a matter of figuring out how to abuse them and make them work in ways not intended. 😈

HTH.

Have fun with it. You’ll be surprised what you may find when thinking like a dev.


Dana Epp

Hey, I’m Dana, aka SilverStr. I build and break software for a living, and am a Microsoft Regional Director and Developer Security MVP. I’ve spent decades as a security architect who focuses on helping secure software, data, and infrastructure on both blue and red teams. As of late, I have been focusing more on my offensive tradecraft to help developers and IT administrators see the impact of exploiting vulnerabilities in their work. This blog is my chance to give back to the community by sharing my experiences and war wounds from the trenches.