February 27, 2024

5 mistakes beginners make during app recon

App recon is the critical first phase in API security testing, embodying the meticulous art of intelligence gathering.

Dubbed “walking the app,” this is a foundational step of reconnaissance. It allows you to map the terrain, revealing the intricate web of endpoints that form the lifeblood of any application.

By methodically examining how an application works, you not only uncover the expected functionalities but also pry open potential crevices for vulnerabilities. These are the hidden doorways through which security breaches can erupt, sometimes with devastating consequences.

With an ever-expanding digital ecosystem, understanding the importance of app recon has never been more paramount.

So let’s explore five mistakes beginners usually make during app recon.

Mistake #1: They don’t properly prepare for reconnaissance

You’d think it would go without saying that you need to use the right tools. But when I talk to beginners, they think the best way to walk the app is to simply use it.

It’s a good attitude. But misses a key point.

You must record what you’re doing so you can look back and understand how the app functions.

One way to do this is to leverage the browser’s devtools to track everything and then export it to an HTTP archive (HAR) file. I’ve shown how to do this before in reference to generating your own API docs.

Another more preferred method is to use Burp Suite’s built-in browser, which automatically proxies all requests directly through it. This also generates a full site map and activates the passive web vulnerability scanner built into Burp Suite Professional to spider the site as you use it.

Whatever method you use, prepare yourself in advance and make sure you are recording all requests and responses. This will help to download and save all the client side code, cookies, and session data while allowing you to collect all the routes, endpoints, parameters, and JSON data to help you get a good picture of what is going on.

Mistake #2: They aren’t identifying the entire surface area

It’s surprising how many beginners miss out on a whole bunch of opportunities simply by not identifying the full attack surface of the application.

Sure you want to use the main functionality. But are you seeing ALL of the functionality available to you?

Many modern frontend frameworks use the idea of “code splitting” to lazy load code and functionality. This improves performance by not downloading app components until needed. So it’s important to make sure you click on every link, feature, and function that you can.

But that’s not enough.

Paid functionality

Consider things like paid functionality. Are there features not available in your current license to the application that might be lit up if you paid for a higher-tier license or subscription?

Chances are, even just signing up for a trial might be enough to get you access to more of the application, opening up an entirely new attack surface.

It’s always a good idea to use the highest tier of the application you are testing and enable all features and functions that you can. This will help expose you to the largest surface area of the application.

Admin functionality

Alongside using as much functionality as you can as the end user, you also want to get access as the highest-level administrator available to you. In many cases, admin functionality is managed in an entirely different code execution path that can expose significantly more surface area.

Just like the end user testing, you want to make sure you enable as many features and functions as possible to expand the surface area. Just remember once you enable a new feature, check how that impacts the user experience for the standard users in the tenant. You might very well light up new functionality that they won’t normally be able to see.

Undocumented APIs

As you discover the sitemap and log all the API calls the frontend executes, know that there may still be more surface area. It’s important to fuzz the API to discover potentially hidden or undocumented endpoints that aren’t called under normal operations.

For example, using feature flags or staged rollouts may introduce new functionality that you may not yet have access to but that other tenants do. By understanding the structure of how an API is being used, you might be able to use patterns to predict this.

I discussed this in my article about Finding Hidden API Endpoints Using Path Prediction.

My point is, don’t just assume you’ve seen it all. Always look for indicators of more functionality. If you can, track the release notes of an application. Use tools like cewl to help with wordlist generation using those release notes. You can follow my guidance on a “cewl” way for API discovery to learn more on how to do this.

Mistake #3: They aren’t analyzing application behavior

Beginners tend to gravitate to looking at the core functionality of the application. But they don’t always look closely enough to understand what the application is doing behind the scenes, nor how functionality in one area may impact another.

As an example, while an application may use paging to limit how much data is coming back on each request, beginners may not stress test the app to see what happens when a parameter is removed or changed and how it impacts how the app functions.

Understanding the impact of such changes helps to track application flow. Could removing a paging parameter bring back the entire recordset? Could it make the application unresponsive as it tries to query and pull back all the data? Or does it bring back a known set of data? What if you modify it to bring back an unreasonable number of records? How does that impact application behavior?

Analyzing application behavior goes beyond just parameter poisoning. Discovery is just as important. Burp extensions like param-miner can help identify hidden, unlinked parameters you may not see by simply walking the app. This can be used in web cache poisoning attacks and help educate you on how API inputs might alter application behavior. It also can modify how an application may function and what API calls get made.

Mistake #4: They aren’t analyzing authentication and session management

It’s no secret that one of the primary security issues typically found in APIs relates to authentication (authN) and authorization (authZ). So how the application itself handles authN and authZ is just as important.

Beginners sometimes gloss over the fact that while there may be decent authentication and authorization showing on the front end, it might be lacking on the back end.

This can come in many forms.

Token misuse

First off, how are API requests authorized? Is the front end using access tokens? Are they in the form of JSON Web Tokens (JWT) or some other opaque data? How are these tokens obtained?

Once obtained, how are they used? Are they passed in by an authorization header, some other header, or maybe through a cookie?

Do the tokens include claims that represent access to functionality? This may be through roles, scopes, or even custom claims. I’ve talked about this before when writing about how to find access control issues in APIs.

Can tokens be manipulated to change behavior? How does the app function if a token is altered? Deleted? Replaced?

Knowing all this is extremely important.

Missing or Mismatched AuthZ

As developers add new API endpoints, they sometimes forget to add logic to authorize requests. If they aren’t using authorization middleware that helps to enforce it, as they build out the API they will remove any logic that may slow down their development and testing.

Beginners sometimes forget to check for this. This is really too bad, as there are Burp extensions like Autorize that can help automate the checks for this.

I’ve shown how to use extensions like Autorize to find authZ issues before. When set up correctly, it lets you walk the app and automatically find these issues without doing any extra work.

Cross Tenant Data Leakage

A key part of session management, especially in multi-tenant environments, is making sure that there is no “bleed” between users of the system. Not just individual users within a single tenant but also between the tenants themselves.

This is called cross tenant data leakage (CTDL)… or sometimes just “tenant bleed”. You can read my article on CTDL to see several real-world examples that have impacted cloud providers like Microsoft, Amazon, and Google.

It’s a real problem that many beginners miss.

Mistake #5: They aren’t mapping how data is handled

It’s not uncommon to find beginners missing the holy grail of app recon… discovering how APIs handle data.

It comes in many forms.

Like the structure and schema of objects being passed around. Or how the front end may include logic for input validation that the API itself may not. Let’s not forget the fact that security controls like rate limiting may impact how much data can be pulled back from an API. Or the fact an API might return too much data that developers may filter out in the front end code, far too late in the process.

Determining how the app passes data around is vital during recon.

Follow data patterns

Developers are creatures of habit. Discovering the patterns they use to handle data can help inform how you construct your attack payloads.

As an example, do they access data through query parameters on a resource like https://api.example.com/users?name=dana or do they use a more RESTful URI structure like https://api.example.com/users/dana that uses route paths as the params?

Watch how data is structured

Do they use a specific Content-Type header to handle data structure? Can you modify that and change the behavior of the app? When I showed you how to attack APIs by tainting data in weird places, I was exploiting the fact that sometimes developers rely on frameworks that may try to convert from one content type to another. While the developer was expecting JSON, what happens if you send it XML?

It can get worse. Beginners sometimes miss that you can sometimes exploit APIs by using Structured Format Injection (SFI) because they don’t look for the relationship between data inputs that can be tainted with how that data may be sent to the backend APIs.

It’s easy for beginners to gloss over all this during app recon, which is really too bad. It can make the whole process of vulnerability research and attack payload construction much easier when you understand how data is created, manipulated, and shared within the system.

Conclusion

What do you think? Do any of these mistakes seem familiar to you? Do they resonate with your own experiences?

It does for me. I’ve made my share of mistakes, including every one of these at some point in the past. Luckily, I’ve continuously refined my hacking methodology by implementing processes and tools to help improve my app recon.

Hopefully, this article will spark some internal introspection and thought for you on how to improve your own methodology. Or, at the very least, give you an internal checklist to verify that you are avoiding these mistakes.

Hack hard!

One last thing…

Have you joined The API Hacker Inner Circle yet? It’s my FREE weekly newsletter where I share articles like this, along with pro tips, industry insights, and community news that I don’t tend to share publicly. Subscribe at https://apihacker.blog.

Dana Epp

Hey, I’m Dana, aka SilverStr. I build and break software for a living, and am a Microsoft Regional Director and Developer Security MVP. I’ve spent decades as a security architect that focuses on helping secure software, data, and infrastructure on both blue and red teams. As of late, I have been focusing more on my offensive tradecraft to help developers and IT administrators see the impact of exploitation on vulnerabilities in their work. This blog is my chance to give back to the community by sharing my experiences and war wounds from the trenches.