Hacking API discovery with a custom Burp extension

So a few months back, I introduced you to the idea of weaponizing API discovery metadata to detect and catalog APIs. I concluded the article by saying that as the specification for API discovery metadata isn’t widely adopted yet, the API Discovery Burp extension I wrote won’t find a lot for you… yet.

Well, that wasn’t good enough for me. 

Why run an extension tied into the Burp Web Vulnerability Scanner that won’t find API documentation now, when you need it? So I decided to improve the code to find API documentation artifacts using a more direct, brute-force approach.

It’s all Bishop Fox’s fault.

Let me explain.

Swagger Jacker and API documentation brute forcing

Earlier this year, Bishop Fox released Swagger Jacker. It’s a tool for auditing endpoints defined in exposed Swagger/OpenAPI definition files.

A somewhat hidden (but very valuable) feature in Swagger Jacker is its ability to send a series of requests to find an API definition file on the target. Think of it as a directory or file enumeration with its own custom wordlist, looking for common artifacts found in Swagger and OpenAPI documentation. 

Bishop Fox’s approach uses a dynamically generated wordlist, which, honestly, is better than the standard swagger.txt found in SecLists. While the SecLists wordlist has only about 50 possible locations to check, Swagger Jacker tries several hundred permutations.

But I think I can do better (or worse)… depending on how you look at it.

Why API document wordlists typically suck

So here’s the thing. Tools that scaffold and generate API code and documentation usually suck, mainly because each vendor decides on its own way to render out the files. So while one tool may put the docs in /apidocs, another may drop them in the root of the server.

Some tools store them in the old Swagger format in a swagger.json or swagger.yaml file. Others have moved to the newer OpenAPI (OAS) 3.x format in openapi.json or openapi.yaml.

And through all that, different tools use different versioning and pathing, so the files can end up almost anywhere.

There is just no one way to do it. 

This is why things like the API discovery metadata specifications were built.

What makes it worse is that there is no guarantee the API itself will be at the root of the server. In fact, many times the API routes exist nested in the web app or even in a separate virtual host (VHOST), which I’ve discussed before. That’s why attack surface mapping is a thing.

So it’s important that you recursively check for API docs as you discover new paths and subdirectories. Unfortunately, most people miss that… and miss finding the API docs.

Swagger Jacker tries to address this by expanding the paths it attempts to discover. But it requires you to manually point it at the (sub)directory you want to brute-force.

I believe I can do better than that by letting Burp Suite do all the heavy lifting. As it crawls a target to populate the Site Map, I can intercept every new subdirectory it discovers with a custom extension and force a scan for API documents in real time.

The trick is to make it performant and resilient to the abuse the extension is going to force on Burp Suite.

Let me show you how I did it.

API Doc Path Enumeration

You can follow along with the Kotlin code I am about to show you in the GitHub repo for my API Discovery extension for Burp Suite. And feel free to contribute pull requests if you see ways to improve the code as I go through it.

We first need a set of variables that can hold the prefix directories, doc endpoints, and UI endpoints, respectively.

    private val prefixDirs = listOf(
        "",
        "/swagger",
        "/swagger/docs",
        "/swagger/latest",
        "/swagger/v1",
        "/swagger/v2",
        "/swagger/v3",
        "/swagger/static",
        "/swagger/ui",
        "/swagger-ui",
        "/swagger-docs",
        "/api-docs",
        "/api-docs/v1",
        "/api-docs/v2",
        "/apidocs",
        "/api",
        "/api/v1",
        "/api/v2",
        "/api/v3",
        "/v1",
        "/v2",
        "/v3",
        "/doc",
        "/docu",
        "/docs",
        "/docs/swagger",
        "/docs/swagger/v1",
        "/docs/swagger/v2",
        "/docs/swagger-ui",
        "/docs/swagger-ui/v1",
        "/docs/swagger-ui/v2",
        "/docs/v1",
        "/docs/v2",
        "/docs/v3",
        "/public",
        "/redoc"
    )

    private val docEndpoints = listOf(
        "",
        "/index",
        "/swagger",
        "/swagger-ui",
        "/swagger-resources",
        "/swagger-config",
        "/openapi",
        "/api",
        "/api-docs",
        "/apidocs",
        "/v1",
        "/v2",
        "/v3",
        "/doc",
        "/docs",
        "/apispec",
        "/apispec_1",
        "/api-merged"
    )

    private val uiEndpoints = listOf(
        "/swagger-ui-init",
        "/swagger-ui-bundle",
        "/swagger-ui-standalone-preset",
        "/swagger-ui",
        "/swagger-ui.min",
        "/swagger-ui-es-bundle-core",
        "/swagger-ui-es-bundle",
        "/swagger-ui-standalone-preset",
        "/swagger-ui-layout",
        "/swagger-ui-plugins"
    )

I’ve also created variables for the file extensions used on these endpoint paths. Again, different tools render out the files with and without extensions, so we have to account for all of these conditions.

    private val docExtensions = listOf(
        "",
        ".json",
        ".yaml",
        ".yml",
        ".html",
        "/"
    )

    private val uiExtensions = listOf(
        "",
        ".js",
        ".html"
    )

All in all, it renders a dynamic wordlist of over 4,000 possible combinations. It might be overkill, but I’d rather spray the extra requests across the target and deal with the backlash than miss a published doc file.
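
To make that concrete, the combination logic in a generatePathList() helper (which you’ll see called in the enumeration code later) might look something like this rough sketch; the actual implementation in the repo may differ:

    // Rough sketch: render the dynamic wordlist by combining the lists above.
    // The exact helper in the repo may differ.
    private fun generatePathList(target: String): List<String> {
        val urls = mutableListOf<String>()
        for (prefix in prefixDirs) {
            // Documentation artifacts: prefix + doc endpoint + doc extension
            for (endpoint in docEndpoints) {
                for (ext in docExtensions) {
                    urls.add("$target$prefix$endpoint$ext")
                }
            }
            // UI artifacts: prefix + UI endpoint + script/page extension
            for (endpoint in uiEndpoints) {
                for (ext in uiExtensions) {
                    urls.add("$target$prefix$endpoint$ext")
                }
            }
        }
        return urls.distinct() // well over 4,000 unique URLs per target
    }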

If I were being more covert, I would probably alter this to be more targeted based on other platform signals discovered during recon, but as my work doesn’t require that… I’ll be loud and proud. YMMV, of course.

Of course, sending that many requests for every new path discovered has its limits, especially when it triggers rate limiting. So we need to address that.

Making resilient requests in a Burp extension

By default, making HTTP requests in a Burp extension doesn’t account for any security controls a target may use to throttle requests. So it’s up to us to codify a solution that handles that.

There are plenty of ways to do it. While you could watch for HTTP responses with a 429 status code and adjust your request timing based on whatever Retry-After metadata is provided, API servers are not guaranteed to provide that information properly, especially when a WAF or API gateway is in play.

My solution is simpler and ignores all that. We’ll just use exponential backoff with an initial delay of 2 seconds and a maximum of 3 attempts. That means the second attempt is delayed by 2 seconds and the third by 4; with more retries allowed, the delays would keep doubling (8, 16, and so on).

I further speed things up by tuning the connection timeout to 3.5 seconds, ensuring we move on quickly to the next path in the wordlist.

The whole function looks something like this:

// Imports required by this function
import java.io.IOException
import java.net.HttpURLConnection
import java.net.SocketTimeoutException
import java.net.URI
import kotlin.math.pow

private fun resourceExistsWithRetry(url: String, maxRetries: Int = 3, initialDelayMillis: Long = 2000L): Boolean {
    var attempt = 0

    while (attempt < maxRetries) {
        var connection: HttpURLConnection? = null
        try {
            connection = (URI(url).toURL().openConnection() as HttpURLConnection).apply {
                requestMethod = "GET"
                connectTimeout = 3500
                readTimeout = 3500
                connect()
            }

            val responseCode = connection.responseCode
            if (responseCode == HttpURLConnection.HTTP_OK) {
                return true
            } else if (responseCode == HttpURLConnection.HTTP_UNAVAILABLE || responseCode == 429) {
                // 503 and 429 are retryable; fall through to the backoff below
                println("Server returned retryable status code: $responseCode")
            } else {
                return false
            }
        } catch (e: SocketTimeoutException) {
            println("Request timed out on attempt ${attempt + 1}")
        } catch (e: IOException) {
            println("IOException on attempt ${attempt + 1}: ${e.message}")
        } finally {
            connection?.disconnect() // Ensure the connection is properly closed
        }

        // Increase delay exponentially: initialDelayMillis * 2^attempt
        val exponentialDelay = initialDelayMillis * 2.0.pow(attempt).toLong()
        attempt++

        if (attempt < maxRetries) {
            Thread.sleep(exponentialDelay)
        }
    }

    return false
}

Speeding up API discovery with parallel processing

It surprises me when I see people write Burp extensions with logic that runs in series. It is painfully slow and just isn’t very efficient. PortSwigger doesn’t offer practical examples of writing extensions that leverage multi-threading or parallelism, so I suppose I get it.

However, the ability to execute code more performantly exists in Burp Suite.  

This was one of the main reasons I was so eager to move from the older Python/Jython extensions to the new Montoya API and write my extension in Kotlin: I can take advantage of async calls and coroutines.

With coroutines, you avoid callback hell and can write non-blocking code in a natural, linear way.

Here is the async function I wrote. It allows Burp Suite to make over 4,000 HTTP requests against a target in roughly 10 to 15 seconds (on average), while staying resilient enough to slow down and back off if the target begins to balk at the enumeration:

// Imports required by this function
import kotlinx.coroutines.CoroutineDispatcher
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.coroutineScope

suspend fun enumerateAPIDocPaths(target: String, dispatcher: CoroutineDispatcher = Dispatchers.IO): CheckedPath = coroutineScope {
    val urls: List<String> = generatePathList(target)

    // Run resourceExistsWithRetry in parallel for each URL
    val pathResults = urls.map { url ->
        async(dispatcher) { if (resourceExistsWithRetry(url)) url else null }
    }.awaitAll()
     .filterNotNull()
     .distinct() // Deduplicate the list

    return@coroutineScope CheckedPath(
        target = target,
        apiDocPathDetected = pathResults.isNotEmpty(),
        detectedPaths = pathResults
    )
}
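
For reference, CheckedPath is just a result holder. Inferring from how it’s constructed above and consumed in conductPathDiscovery() below, it presumably looks something like this (the lastChecked default is my assumption; see the repo for the real definition):

// Hypothetical sketch of CheckedPath; fields are inferred from the calls in
// this article, and lastChecked is assumed to be an epoch-millis timestamp.
data class CheckedPath(
    val target: String,
    val apiDocPathDetected: Boolean,
    val detectedPaths: List<String>? = null,
    val lastChecked: Long = System.currentTimeMillis()
)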

Refactoring the Passive Audit in Burp Suite Scanner

To support the addition of brute-force path discovery in my API Discovery extension, I had to refactor the hooks for the passive audit. This allowed me to break out the metadata discovery from the path discovery and generate individual issues that can be pushed into the Audit Issue history.
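
To give you an idea of the shape of that refactor, the passive audit hook from the Montoya API’s ScanCheck interface might be wired up something like this sketch (the actual hook in the repo may differ):

import burp.api.montoya.http.message.HttpRequestResponse
import burp.api.montoya.scanner.AuditResult
import burp.api.montoya.scanner.audit.issues.AuditIssue

// Hypothetical sketch of the refactored passiveAudit() hook.
override fun passiveAudit(baseRequestResponse: HttpRequestResponse): AuditResult {
    val issues = mutableListOf<AuditIssue>()

    // The metadata discovery check would contribute its own issue here as well.
    conductPathDiscovery(baseRequestResponse)?.let { issues.add(it) }

    return AuditResult.auditResult(issues)
}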

The meat of the new discovery code is in conductPathDiscovery():

private fun conductPathDiscovery(baseRequestResponse: HttpRequestResponse?): AuditIssue? {
    val detail = StringBuilder("")

    val target = getTargetPath(baseRequestResponse?.request()?.url() ?: "http://localhost")

    val index = checkedPaths.indexOfFirst { it.target == target }
    var targetPath = if (index != -1) checkedPaths[index] else null

    // Did we already scan this path in the last hour?
    if (targetPath != null && isWithinLastHour(targetPath.lastChecked)) {
        return null
    }

    runBlocking {
        targetPath = APIDocPathEnumeration(api).enumerateAPIDocPaths(target)
    }

    if (targetPath?.apiDocPathDetected == true) {
        detail.append(
            "Potential API doc path(s) have been discovered at <a href=\"$target\">$target</a>"
        ).append("<br><br>")

        targetPath!!.detectedPaths?.takeIf { it.isNotEmpty() }?.let { paths ->
            detail.append("Potential doc paths:").append("<br>")
            paths.forEach { path ->
                detail.append(path).append("<br>")
            }
        }

        detail.append("<br>")

        api.logging().logToOutput("Detected potential API doc paths at $target")
    }

    checkedPaths.apply {
        if (index != -1) targetPath?.let { set(index, it) } 
        else targetPath?.let { add(it) }
    }

    if (detail.isEmpty()) {
        return null
    }

    return AuditIssue.auditIssue(
        "API doc path discovered",
        detail.toString(),
        null,
        target,
        AuditIssueSeverity.INFORMATION,
        AuditIssueConfidence.CERTAIN,
        null,
        null,
        AuditIssueSeverity.LOW,
        baseRequestResponse
    )
}

A few things to call out. First, I include an arbiter that prevents the constant (re)scanning of paths Burp Suite sees. There is no reason to hammer the API server with thousands of requests whose results are unlikely to change within any given hour.
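
The arbiter depends on an isWithinLastHour() helper. A minimal sketch, assuming lastChecked is stored as epoch milliseconds (as in the CheckedPath sketch earlier):

// Hypothetical sketch: true if this path was checked less than an hour ago.
private fun isWithinLastHour(lastChecked: Long): Boolean {
    val oneHourMillis = 60 * 60 * 1000L
    return System.currentTimeMillis() - lastChecked < oneHourMillis
}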

And I run the asynchronous coroutines inside a runBlocking {} block, which lets me call suspend functions like enumerateAPIDocPaths() from the otherwise blocking ScanCheck code of the Montoya API.

The end result: API doc discovery Audit Issues

So with these changes, any time Burp Suite runs a passive audit on an in-scope target, the API Discovery extension will enumerate every new path it sees with several thousand permutation requests, looking for possible API documentation artifacts. If it finds anything, it generates an Audit Issue, which you can find on the Dashboard.

You can also find it directly in the Site Map.

Conclusion

With these new code changes, the API Discovery extension not only looks for API discovery metadata, but also brute-force enumerates common (and uncommon) paths and files that may signal API documentation is published on the target.

This work is inspired by Bishop Fox’s Swagger Jacker. It’s a great command-line tool you can use if you know the subdirectory you want to scan and are willing to do it manually.

However, this extension improves on Swagger Jacker by automatically scanning EVERY subdirectory the Burp Suite scanner detects during its own web vulnerability scans. You do nothing… until an Audit Issue points you to where the API documentation artifacts live. This is a more efficient and effective way to do API discovery during your recon process.

I hope you like it.

Think it can be improved? Feel free to send me a pull request with your changes.

Enjoy.

One last thing…

API Hacker Inner Circle

Have you joined The API Hacker Inner Circle yet? It’s my FREE weekly newsletter where I share articles like this, along with pro tips, industry insights, and community news that I don’t tend to share publicly. 
If you haven’t, subscribe at https://apihacker.blog.
