Find the Hidden JSON API Behind Any Site
Most modern sites that look like HTML are secretly driven by JSON APIs. Finding that API turns a messy scraping job into reading documentation you didn’t know existed.
Open DevTools → Network → filter Fetch/XHR. Reload the page. What you’re looking for:
- A request whose response is JSON matching what’s on screen
- Query params that look like pagination, search, or filters
- A stable path like /api/v1/products rather than a one-off /graphql query
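If the Network tab is noisy, the first check in the list above can be roughed out in code. This is only a sketch: it assumes you exported the Network panel as a HAR file, and the helper names (load_har, find_json_requests) and the search string are made up for illustration.

```python
import json
from typing import List


def load_har(path: str) -> dict:
    """Read a HAR file exported from the browser's Network panel."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def find_json_requests(har: dict, needle: str) -> List[str]:
    """Return URLs whose JSON response body contains `needle`.

    `needle` is a piece of text you can see on the page, such as a product name.
    """
    hits = []
    for entry in har["log"]["entries"]:
        content = entry["response"].get("content", {})
        mime = content.get("mimeType", "")
        text = content.get("text") or ""  # bodies are sometimes omitted
        if "json" in mime and needle in text:
            hits.append(entry["request"]["url"])
    return hits


# Hypothetical usage:
# har = load_har("example.com.har")
# print(find_json_requests(har, "Blue Widget"))
```

Searching for a string you can see on screen is a crude match, but it usually narrows dozens of requests down to one or two candidates.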
Right-click the request → Copy → Copy as cURL. Paste it into curlconverter.com and you have working Python:
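import requests

# Headers copied from the browser request; the values are elided here.
headers = {
    "accept": "application/json",
    "user-agent": "Mozilla/5.0 ...",
    "x-api-key": "...",  # often needed
}
resp = requests.get(
    "https://example.com/api/v1/products",
    params={"page": 1, "limit": 50},
    headers=headers,
    timeout=10,
)
resp.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error body
data = resp.json()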
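Once one page works, the page and limit params suggest a simple loop over the rest. A minimal sketch, assuming the endpoint returns its rows under an "items" key and returns an empty list past the last page; both are guesses you should check against the real response. The helper names are hypothetical.

```python
from typing import Any, Callable, Iterator, List


def make_fetcher(session, url: str, limit: int = 50,
                 items_key: str = "items") -> Callable[[int], List[Any]]:
    """Build a function that fetches one page of results.

    `items_key` is an assumption about the response shape.
    """
    def fetch_page(page: int) -> List[Any]:
        resp = session.get(url, params={"page": page, "limit": limit}, timeout=10)
        resp.raise_for_status()
        return resp.json()[items_key]
    return fetch_page


def iter_items(fetch_page: Callable[[int], List[Any]],
               max_pages: int = 100) -> Iterator[Any]:
    """Yield items page by page until an empty page or max_pages is hit."""
    for page in range(1, max_pages + 1):
        items = fetch_page(page)
        if not items:
            break
        yield from items


# Hypothetical usage with the headers from the request above:
# import requests
# session = requests.Session()
# session.headers.update(headers)
# fetch = make_fetcher(session, "https://example.com/api/v1/products")
# for product in iter_items(fetch):
#     print(product)
```

The max_pages cap is there so a misread stop condition cannot turn into an endless crawl.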
Two things to watch for:
Auth tokens. If there’s an Authorization or x-csrf-token header, it probably came from a cookie or an earlier request. Reproduce that request first, then reuse the session.
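Here is one way the "earlier request first, then reuse the session" pattern can look. It is a sketch under assumptions: the cookie name csrftoken and the idea that the token is simply copied from cookie to header are common but not universal, so check what the site actually does in DevTools. The function name is hypothetical.

```python
def prime_session(session, page_url: str,
                  cookie_name: str = "csrftoken",
                  header_name: str = "x-csrf-token"):
    """Load the HTML page once so the server sets its cookies,
    then copy the CSRF cookie into a header for later API calls.

    `cookie_name` and `header_name` are assumptions; use the names
    you see in the copied request.
    """
    resp = session.get(page_url, timeout=10)
    resp.raise_for_status()
    token = session.cookies.get(cookie_name)
    if token is None:
        raise RuntimeError(f"no {cookie_name!r} cookie set by {page_url}")
    session.headers[header_name] = token
    return session


# Hypothetical usage:
# import requests
# session = prime_session(requests.Session(), "https://example.com/products")
# resp = session.get("https://example.com/api/v1/products", timeout=10)
```

Because the same Session object carries both the cookies and the header, every later call looks like it came from the page that issued the token.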
Rate limits. They are often different on the API: sometimes stricter (the API is the real product), sometimes looser (because only power users find it). Test with small requests before running at full tilt.
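A cautious way to probe the limit is to back off whenever the server answers 429. This sketch assumes the API signals rate limiting with HTTP 429 and, optionally, a numeric Retry-After header; some APIs use other status codes or custom headers, so treat it as a starting point. The function name and defaults are illustrative.

```python
import time


def get_with_backoff(session, url: str, max_tries: int = 5,
                     base_delay: float = 1.0, sleep=time.sleep, **kwargs):
    """GET `url`, waiting and retrying while the server returns 429.

    Uses a numeric Retry-After header when present, otherwise
    exponential backoff: base_delay, 2x, 4x, and so on.
    """
    kwargs.setdefault("timeout", 10)
    for attempt in range(max_tries):
        resp = session.get(url, **kwargs)
        if resp.status_code != 429:
            resp.raise_for_status()  # surface other errors immediately
            return resp
        retry_after = resp.headers.get("Retry-After", "")
        if retry_after.isdigit():
            delay = float(retry_after)
        else:
            delay = base_delay * 2 ** attempt
        sleep(delay)
    raise RuntimeError(f"still rate-limited after {max_tries} tries: {url}")


# Hypothetical usage:
# import requests
# resp = get_with_backoff(requests.Session(),
#                         "https://example.com/api/v1/products",
#                         params={"page": 1, "limit": 5})
```

Passing sleep in as a parameter keeps the function easy to test and lets you swap in a jittered delay later.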
Finding the API is the single highest-leverage move in scraping. It turns hours of HTML wrangling into a 20-line script.