Find the Hidden JSON API Behind Any Site

Many modern sites that look like HTML are secretly driven by JSON APIs. Finding that API turns a messy scraping job into reading documentation you didn’t know existed.

Open DevTools → Network → filter Fetch/XHR. Reload the page. What you’re looking for:

  • A request whose response is JSON matching what’s on screen
  • Query params that look like pagination, search, or filters
  • A stable path like /api/v1/products rather than a one-off /graphql query
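
If the Network tab is noisy, the same checklist can be run over an exported HAR file (DevTools can save the request log as HAR). The sketch below is only an illustration: json_candidates is a hypothetical helper name, and it assumes the export follows the usual HAR layout (log → entries → request/response).

```python
from urllib.parse import parse_qs, urlsplit


def json_candidates(har):
    """Return (method, path, query param names) for each JSON response.

    `har` is the dict you get from json.load() on an exported HAR file.
    """
    found = []
    for entry in har["log"]["entries"]:
        mime = entry["response"]["content"].get("mimeType", "")
        if "json" not in mime:
            continue  # skip HTML, images, scripts, and so on
        parts = urlsplit(entry["request"]["url"])
        # The param names are what hint at pagination, search, or filters.
        found.append(
            (entry["request"]["method"], parts.path, sorted(parse_qs(parts.query)))
        )
    return found
```

Load the export with json.load and pass the resulting dict in; a path that shows up repeatedly with params like page or limit is a good first candidate.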

Right-click the request → Copy → Copy as cURL. Paste it into curlconverter.com and you have working Python:

import requests

headers = {
    "accept": "application/json",
    "user-agent": "Mozilla/5.0 ...",
    "x-api-key": "...",   # often needed
}

resp = requests.get(
    "https://example.com/api/v1/products",
    params={"page": 1, "limit": 50},
    headers=headers,
    timeout=10,
)
resp.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error page
data = resp.json()
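
Once a single request works, the page and limit params can usually be walked in a loop. A minimal sketch, assuming the endpoint returns a JSON list per page and an empty list past the last page; real APIs often wrap results in an object or use cursors instead, so check the responses in DevTools first. fetch_all is a hypothetical helper, and it takes the session as an argument so it can be tried against a stub.

```python
def fetch_all(session, url, headers=None, limit=50, max_pages=100):
    """Collect items page by page until an empty page comes back.

    `session` is expected to behave like requests.Session. The `page` and
    `limit` param names mirror the request above and are assumptions about
    the target API, as is the bare-list response shape.
    """
    items = []
    for page in range(1, max_pages + 1):
        resp = session.get(
            url,
            params={"page": page, "limit": limit},
            headers=headers,
            timeout=10,
        )
        resp.raise_for_status()
        batch = resp.json()
        if not batch:  # empty page: assume we ran off the end
            break
        items.extend(batch)
    return items
```

With requests this would be called as fetch_all(requests.Session(), "https://example.com/api/v1/products", headers=headers). The max_pages cap is there so a misread pagination scheme cannot loop forever.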

Two things to watch for:

Auth tokens. If there’s an Authorization or x-csrf-token header, it probably came from a cookie or an earlier request. Reproduce that request first, then reuse the same session so the cookies and token carry over.
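
A rough sketch of that two-step flow, under loud assumptions: the token is taken to live in a cookie named csrftoken (a placeholder; the real name varies by site), and the earlier request is taken to be a plain page load. get_with_csrf is a hypothetical helper, and session is expected to behave like requests.Session.

```python
def get_with_csrf(session, page_url, api_url, params=None, cookie_name="csrftoken"):
    """Load a page to pick up cookies, then call the API with the token."""
    # Step 1: hit the page that normally precedes the API call so the
    # session's cookie jar receives whatever the server sets.
    session.get(page_url, timeout=10).raise_for_status()

    # Step 2: copy the token from the cookie jar into the header.
    # `cookie_name` is an assumption; check the real name in DevTools.
    headers = {"accept": "application/json"}
    token = session.cookies.get(cookie_name)
    if token:
        headers["x-csrf-token"] = token

    # Step 3: same session, so the cookies travel with the API request.
    resp = session.get(api_url, params=params, headers=headers, timeout=10)
    resp.raise_for_status()
    return resp.json()
```

If the token arrives in a JSON body or a meta tag rather than a cookie, only step 2 changes; the point is that one session carries state from the first request to the second.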

Rate limits. They are often different on the API: sometimes stricter (the API is the real product), sometimes looser (because only power users find it). Test with small requests before running at full tilt.
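
One way to stay on the safe side is to back off whenever the server answers 429 (Too Many Requests). The sketch below is an assumption about how the target behaves, not a description of any particular API: polite_get is a hypothetical helper, it only honors Retry-After when the value is a whole number of seconds, and it falls back to exponential backoff otherwise.

```python
import time


def polite_get(session, url, max_retries=3, base_delay=1.0, sleep=time.sleep, **kwargs):
    """GET that waits and retries on HTTP 429; returns the last response."""
    for attempt in range(max_retries + 1):
        resp = session.get(url, timeout=10, **kwargs)
        if resp.status_code != 429 or attempt == max_retries:
            return resp
        # Prefer the server's hint when it is plain seconds; an HTTP-date
        # value is not parsed here and falls through to the backoff.
        hint = resp.headers.get("Retry-After", "")
        wait = float(hint) if hint.isdigit() else base_delay * 2 ** attempt
        sleep(wait)
```

The sleep argument is injectable so the retry logic can be exercised without actually waiting. The caller still has to check the returned status code, since the final response may itself be a 429.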

Finding the API is the single highest-leverage move in scraping. It turns hours of HTML wrangling into a 20-line script.