Reverse Engineering APIs

Introduction

Let’s say you want to automate your interactions with a specific web application, but unfortunately, there’s no publicly available API to rely on. What can you do? It might be worth investigating the application’s endpoints to understand how you can interact with it by creating your own custom API.

That’s the primary focus of this article. We’ll walk through a hypothetical scenario where I’ll explain the basics of exploring an application’s endpoints, understanding their flow, and coding your own interactions to build a functional API.

Prerequisites

Before diving into this article, I assume you have a basic understanding of HTTP request methods such as GET, POST, and PUT. Additionally, familiarity with using the curl command-line tool is essential, as it will be used extensively throughout our examples. If you are not yet comfortable with these concepts, I recommend reviewing them before proceeding.

The Scenario

Our hypothetical scenario begins with a website called “example.com”. This web application allows users to download music from selected music streaming platforms, offering the option to choose the desired audio quality. Once the music is selected, it is archived and uploaded to a file hosting service of your choice; for this example, let’s use Buzzheavier. The application then provides a direct Buzzheavier link to download the archived files.

We want to understand how we can interact with “example.com” programmatically—specifically, how to use HTTP request methods to leverage the functionality of this application.

Analyzing HTTP Requests

To achieve this, we will use browser developer tools and curl. First, we open the site in the browser, which issues a GET request for the page, and use the application as a regular user would. With the Network tab of the browser’s developer tools open, we can observe every request the browser sends to the application and the response it gets back.

In our case, we notice a POST request being sent when a download is started. This request includes a JSON payload with four fields that need to be populated: url, quality, host, and account. Now, let’s replicate this POST request using curl.

curl -X POST https://example.com/api/ \
-H "Content-Type: application/json" \
-d '{"url":"https://xxxxxx","quality":"9","host":"buzzheavier","account":"XX"}'

Explanation

-X POST: Specifies the HTTP method as POST.
https://example.com/api/: The URL to which the request is sent.
-H "Content-Type: application/json": Sets the Content-Type header to indicate the payload is in JSON format.
-d '{"url":"..."}': Sends the JSON payload as the request body.
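
The same request can be assembled from Python. The following is a minimal sketch using only the standard library; it builds the request object without sending it, so you can inspect what would go over the wire. The function name build_start_request is made up for illustration, and the endpoint and field values are the placeholders from the curl command above.

```python
import json
import urllib.request

API_URL = "https://example.com/api/"  # placeholder endpoint from the curl example


def build_start_request(url, quality="9", host="buzzheavier", account="XX"):
    """Build (but do not send) the POST request that starts a download."""
    payload = {"url": url, "quality": quality, "host": host, "account": account}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_start_request("https://xxxxxx")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(req, timeout=30)
```

Keeping the "build" step separate from the "send" step makes it easy to compare your request with the one shown in the Network tab before anything reaches the server.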

Checking the Status

After sending the POST request to “example.com/api” with the required data, we receive a JSON response containing an id:

{"id": xxxxxx}

Using this id, we can check the status of the download. To do this, we send a GET request as follows:

curl "https://example.com/api?id=xxxxxx"

This request returns a JSON response with various details about the download process. While the response contains multiple fields, we are primarily interested in the status and url fields:

{
  "cover": "https://xxxxxx/cover.jpg",
  "current": "xxxxx",
  "error": null,
  "id": xxxxxx,
  "progress": 1,
  "sourceURL": "https://xxxxxx",
  "status": "done",
  "total": 1,
  "url": "https://buzzheavier.com/xxxxxxxxxxx"
}

When the status field is “done,” the url field will contain the Buzzheavier link to the archived file.
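
In code, reading the status response might look like the sketch below. The helper names status_url and extract_link are hypothetical, the id 123 is a stand-in, and treating a non-null error field as a failure is an assumption; the article only shows a response where error is null.

```python
import json
from urllib.parse import urlencode

API_URL = "https://example.com/api"  # placeholder endpoint


def status_url(download_id):
    """Return the status URL for a given id, i.e. <API_URL>?id=<download_id>."""
    return f"{API_URL}?{urlencode({'id': download_id})}"


def extract_link(status):
    """Return the archive link once the job is done, otherwise None.

    Assumption: a non-null "error" field means the job failed.
    """
    if status.get("error"):
        raise RuntimeError(f"download failed: {status['error']}")
    if status.get("status") == "done":
        return status.get("url")
    return None


# A trimmed copy of the response shown above, with a stand-in id.
sample = json.loads(
    '{"error": null, "id": 123, "progress": 1, "status": "done", '
    '"total": 1, "url": "https://buzzheavier.com/xxxxxxxxxxx"}'
)
print(status_url(sample["id"]))
print(extract_link(sample))
```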

Summary of Steps

Send a POST request with the required data to initiate the process.
Retrieve the id from the response.
Use the GET request with the id to monitor the status and retrieve the download link once ready.

Downloading the File

Once a Buzzheavier link is provided, we can interact with it to download the archived contents. Here’s how you can use the curl command:

curl -L -o "downloaded_file.zip" https://buzzheavier.com/xxxxxxxxxxx/download

Explanation of the Command

-L: This flag tells curl to follow redirects. Some download URLs might redirect to a different location before serving the file, and -L ensures that curl follows these redirects to successfully download the file.
-o "downloaded_file.zip": This specifies the name of the output file where the downloaded content will be saved. In this case, the file will be saved as downloaded_file.zip.
https://buzzheavier.com/xxxxxxxxxxx/download: The Buzzheavier link obtained from the previous step, with /download appended to it. Replace xxxxxxxxxxx with the actual identifier provided in the JSON response.
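
For comparison, here is roughly what the same download could look like in Python with the standard library. This is a sketch under the assumptions already made in the article (the /download suffix and the placeholder link); the function names are hypothetical. Note that urllib.request.urlopen follows redirects by default, which covers what -L does for curl.

```python
import shutil
import urllib.request


def download_url(share_url):
    """Turn the link from the status response into the direct download URL."""
    return share_url.rstrip("/") + "/download"


def save_stream(src, dest_path, chunk_size=8192):
    """Copy a binary file-like object to dest_path in chunks."""
    with open(dest_path, "wb") as dest:
        shutil.copyfileobj(src, dest, chunk_size)


def download(share_url, dest_path):
    """Fetch the archive and write it to disk without loading it into memory."""
    with urllib.request.urlopen(download_url(share_url), timeout=60) as resp:
        save_stream(resp, dest_path)


print(download_url("https://buzzheavier.com/xxxxxxxxxxx"))
# To actually fetch a file:
# download("https://buzzheavier.com/xxxxxxxxxxx", "downloaded_file.zip")
```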

How We Discovered This

We followed the same process as with example.com:

Observed the network activity using browser developer tools while initiating a download from Buzzheavier.
Noted the final URL the browser used to fetch the file, including any redirects.
Replicated the request in curl, ensuring we used the -L flag to handle redirects properly.

Automating with FastAPI

With the information we’ve gathered so far, it’s clear that we can automate this process using any programming language or framework. For practical purposes, I’ve created a FastAPI application to demonstrate how this can be done efficiently. The following code handles:

Starting the download process by sending a POST request with the required data.
Checking the status of the download using a GET request with the received id.
Retrieving and saving the final file using the provided download link.

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import requests

app = FastAPI()

# Define Pydantic model for request body
class DownloadRequest(BaseModel):
    url: str
    quality: str = "9"
    host: str = "buzzheavier"
    account: str = "XX"

class FileDownloadRequest(BaseModel):
    url: str
    file_name: str

API_BASE_URL = "https://example.com/api"

@app.post("/start-download")
def start_download(request: DownloadRequest):
    """
    Starts a download by sending a POST request with the necessary payload
    to the example.com API.
    """
    # .dict() is the Pydantic v1 call; on Pydantic v2 use request.model_dump() instead
    payload = request.dict()

    # Send the POST request to the API; the timeout keeps a stalled upstream from hanging this endpoint
    response = requests.post(API_BASE_URL, json=payload, timeout=30)
    
    if response.status_code != 200:
        raise HTTPException(
            status_code=response.status_code,
            detail="Failed to start the download"
        )
    
    data = response.json()
    download_id = data.get("id")
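    if download_id is None:
        # Test for None explicitly so that an id of 0 is not treated as missing.
        # 502: the upstream API, not the caller, returned an unusable response.
        raise HTTPException(
            status_code=502,
            detail="No ID received from the API"
        )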

    return {"id": download_id}


@app.get("/get-status/{download_id}")
def get_status(download_id: int):
    """
    Retrieves the status of a download by sending a GET request to the example.com API.
    """
    # Let requests build and encode the ?id=... query string
    response = requests.get(API_BASE_URL, params={"id": download_id}, timeout=30)
    
    if response.status_code != 200:
        raise HTTPException(
            status_code=response.status_code,
            detail="Failed to retrieve the download status"
        )
    
    return response.json()

@app.post("/download-file")
def download_file(request: FileDownloadRequest):
    """
    Downloads a file from the given URL and saves it with the provided file name.
    """
    url = request.url
    file_name = request.file_name
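    download_url = f"{url.rstrip('/')}/download"

    # requests follows redirects by default (like curl -L);
    # stream=True avoids loading the whole file into memory
    response = requests.get(download_url, stream=True, timeout=60)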

    if response.status_code != 200:
        raise HTTPException(
            status_code=response.status_code,
            detail="Failed to download the file"
        )
    
    with open(file_name, "wb") as f:
        for chunk in response.iter_content(chunk_size=8192):
            if chunk:
                f.write(chunk)
    
    return {"message": f"File {file_name} downloaded successfully."}
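
One thing the /download-file endpoint above does not guard against is a file_name that contains directory parts such as ../, which would let a caller write outside the intended folder. Below is a minimal sketch of a check you could run before opening the file. The helper safe_destination and the downloads folder are assumptions for illustration, not part of the code above.

```python
from pathlib import Path

DOWNLOAD_DIR = Path("downloads")  # hypothetical target folder


def safe_destination(file_name, base_dir=DOWNLOAD_DIR):
    """Return base_dir/file_name, rejecting names that carry a directory part."""
    name = Path(file_name).name  # drops any leading directories
    if (
        not name
        or name in {".", ".."}
        or name != file_name      # something was stripped, so a path was given
        or "\\" in file_name      # reject Windows-style separators everywhere
    ):
        raise ValueError(f"unsafe file name: {file_name!r}")
    return base_dir / name


print(safe_destination("album.zip"))

for bad in ("../secrets.txt", "nested/album.zip", ".."):
    try:
        safe_destination(bad)
    except ValueError as exc:
        print(exc)
```

Inside the endpoint, a ValueError from this check could be turned into an HTTPException with status code 400.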

Conclusion

With this implementation complete, we can now use our own FastAPI service to automate tasks like downloading music. For this particular case, we could even enhance the system further by integrating features such as a queue to manage multiple downloads efficiently.
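
To give an idea of what such a queue could look like, here is a small sketch built on the standard library's queue and threading modules. Everything in it is an assumption for illustration: the run_queue function, the number of workers, and the stand-in handler, which in practice would start a download, poll its status, and fetch the file.

```python
import queue
import threading


def run_queue(jobs, handler, workers=2):
    """Process jobs with a fixed number of worker threads; return results by job."""
    pending = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            job = pending.get()
            if job is None:  # sentinel: no more work for this worker
                pending.task_done()
                return
            try:
                outcome = handler(job)
            except Exception as exc:  # one failed job should not kill the worker
                outcome = exc
            with lock:
                results[job] = outcome
            pending.task_done()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for job in jobs:
        pending.put(job)
    for _ in threads:
        pending.put(None)  # one sentinel per worker
    pending.join()
    for t in threads:
        t.join()
    return results


# Stand-in handler that only formats a string instead of downloading anything.
done = run_queue(
    ["https://xxxxxx/a", "https://xxxxxx/b"],
    handler=lambda url: f"processed {url}",
)
print(done)
```

Limiting the number of workers keeps the number of simultaneous requests to the target application small.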

The key takeaway here is understanding how to analyze and utilize HTTP request methods to interact with web applications programmatically. By observing network activity and mimicking the application’s API calls, we unlock new ways to automate and streamline workflows that would otherwise require manual interaction.