Guide

What a hidden API is, and why it is usually the most reliable way to get data

When a page loads prices, listings, or search results, that data usually does not live in the HTML — it arrives from an internal JSON endpoint the page calls in the background. We call that a hidden API. For public, factual data, reading it is almost always more reliable than parsing the rendered page.

What "hidden" actually means

Nothing nefarious. Many modern web pages are thin shells that, once loaded, call one or more of their own backend endpoints to fetch the real content as structured JSON. These endpoints are "hidden" only in the sense that they are not documented for outside use and do not appear in the address bar. Your own browser calls them every time you open the page. They are part of how the public site delivers its public data to you.

The key insight: the JSON the page receives is usually cleaner, more stable, and more complete than the HTML it eventually renders. Visual layout changes constantly; the underlying data contract changes far less often.

Why calling it beats parsing HTML

  • Stability. A redesign can rewrite every CSS class and break an HTML scraper overnight. A JSON field named "price" tends to stay "price" through redesign after redesign. Fewer moving parts means fewer 2 a.m. failures.
  • Cleaner data. You get structured fields (numbers as numbers, booleans as booleans, dates as machine-readable strings or timestamps, nested objects) instead of scraping text out of formatted markup and re-parsing it. Less guesswork, fewer edge-case bugs.
  • Completeness. The endpoint often returns more than the page shows: extra attributes, stock levels, identifiers, pagination metadata. You frequently get richer data than what is visible.
  • Efficiency. One JSON call can return data that would otherwise require loading and rendering a full page with images and scripts. That means a lighter, more respectful footprint on the source site.
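
To make the contrast above concrete, here is a minimal sketch that reads the same price two ways. The HTML fragment, the CSS class name, and the JSON payload are all invented for illustration; a real site will differ.

```python
import json
import re

# Hypothetical rendered markup and the hypothetical JSON the page fetched.
HTML_SNIPPET = '<div class="p-card__price--v2"><span>$1,299.00</span></div>'
JSON_PAYLOAD = (
    '{"id": "sku-123", "name": "Example Laptop", "price": 1299.0, '
    '"currency": "USD", "in_stock": true, "stock_level": 7}'
)


def price_from_html(html: str) -> float:
    """Scrape the price from markup: tied to a CSS class and a text format."""
    match = re.search(
        r'class="p-card__price--v2"><span>\$([\d,]+\.\d{2})</span>', html
    )
    if match is None:
        raise ValueError("price markup not found; the layout may have changed")
    # Strip the thousands separator before converting the text to a number.
    return float(match.group(1).replace(",", ""))


def price_from_json(payload: str) -> float:
    """Read the price from JSON: one field lookup, already a number."""
    return json.loads(payload)["price"]


print(price_from_html(HTML_SNIPPET))
print(price_from_json(JSON_PAYLOAD))
# A field the rendered page might not show at all.
print(json.loads(JSON_PAYLOAD)["stock_level"])
```

Renaming the CSS class in the HTML fragment makes price_from_html fail, while price_from_json keeps working as long as the field name holds.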

How to find a hidden API (DevTools, Network, XHR)

You can do this yourself in any browser, on any public page, in a couple of minutes:

  1. Open the page and press F12 (or right-click → Inspect) to open DevTools.
  2. Go to the Network tab and filter by Fetch/XHR. This hides images and scripts so you see only data requests.
  3. Reload the page, or trigger the action you care about — paginate, search, open a listing.
  4. Watch the requests appear. Click the ones that return JSON and look at the Response / Preview tab.
  5. When you spot the request whose response contains your fields — the prices, the listings, the search results — you have found the hidden API. Note its URL, method, and the parameters it takes.
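
Once you have noted the URL, method, and parameters, replaying the request from code might look like the sketch below. The endpoint, the query parameters, and the User-Agent string are placeholders, not a real API; substitute what you saw in the Network tab. The sketch also assumes the endpoint answers a plain GET with a JSON object.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical values: replace with the URL and parameters noted in DevTools.
ENDPOINT = "https://example.com/api/search"
PARAMS = {"q": "laptop", "page": 1}


def build_request(endpoint: str, params: dict) -> urllib.request.Request:
    """Build a GET request that mirrors the one seen in the Network tab."""
    url = f"{endpoint}?{urllib.parse.urlencode(params)}"
    headers = {
        "Accept": "application/json",
        # An identifiable User-Agent; the contact address is a placeholder.
        "User-Agent": "example-feed-bot/0.1 (contact: ops@example.com)",
    }
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch_json(request: urllib.request.Request, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON body (assumed to be an object)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_request(ENDPOINT, PARAMS)
print(request.get_method(), request.full_url)
```

Calling fetch_json(request) performs the actual network call; it is left out of the sketch so the example runs offline.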

From there, the work is reading that endpoint reliably: correct parameters, sensible pagination, a respectful request rate, and handling for the cases where the site expects a real browser session first. That last part is exactly what our guide on Cloudflare-protected public sites covers.
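
As a rough illustration of sensible pagination with a respectful request rate, the sketch below walks pages one at a time and pauses between calls. The response shape ("items" and "has_more") is an assumption made for the example; real endpoints use cursors, offsets, or total counts, so adapt it to what the response actually contains.

```python
import time
from typing import Callable, Iterator


def iter_items(
    fetch_page: Callable[[int], dict],
    delay_seconds: float = 1.0,
    max_pages: int = 100,
) -> Iterator[dict]:
    """Yield items page by page, pausing between requests.

    Assumes a hypothetical response shape: {"items": [...], "has_more": bool}.
    """
    for page in range(1, max_pages + 1):
        data = fetch_page(page)
        yield from data.get("items", [])
        if not data.get("has_more"):
            break
        time.sleep(delay_seconds)  # pace requests; do not hammer the endpoint


# Stand-in for a real HTTP call so the sketch runs offline.
FAKE_PAGES = {
    1: {"items": [{"id": 1}, {"id": 2}], "has_more": True},
    2: {"items": [{"id": 3}], "has_more": False},
}

items = list(iter_items(lambda page: FAKE_PAGES[page], delay_seconds=0.0))
print(items)
```

The max_pages cap is a safety net: if the endpoint never reports the last page, the loop still stops.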

We keep a working starting point in the open: see the hidden-api-extraction-template repository for a clean structure you can build on.

Doing it the compliant way

A hidden API is still the site's infrastructure, so the same rules apply as to any access of a public source:

  • Read the site's Terms of Service and respect them. If the terms forbid automated access, that is a stop sign.
  • Honor robots.txt and any rate limits. Pace your requests; do not hammer the endpoint.
  • Stick to public, factual, non-PII data — catalog and listing fields, not personal information and not anything behind a login.
  • You operate and own the resulting feed. We build it to be a good citizen of the source.

We never frame this as getting around anything. A hidden API is simply the most reliable place to read public data that the site already serves to every visitor, when used responsibly, within terms and rate limits.
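
The robots.txt and pacing rules above can be partly automated. The sketch below uses Python's standard urllib.robotparser on an invented robots.txt; the file content, the bot name, and the URLs are placeholders. Terms of Service still have to be read by a person, since no code can check them for you.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. In practice, load the site's own file with
# parser.set_url("https://<site>/robots.txt") followed by parser.read().
ROBOTS_TXT = """\
User-agent: *
Disallow: /account/
Crawl-delay: 2
"""

USER_AGENT = "example-feed-bot"  # placeholder bot name

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url: str) -> bool:
    """Return True if robots.txt permits this user agent to fetch the URL."""
    return parser.can_fetch(USER_AGENT, url)


def request_delay(default_seconds: float = 1.0) -> float:
    """Use the site's Crawl-delay if it declares one, else a default pause."""
    delay = parser.crawl_delay(USER_AGENT)
    return float(delay) if delay is not None else default_seconds


print(allowed("https://example.com/api/search?q=laptop"))
print(allowed("https://example.com/account/orders"))
print(request_delay())
```

A feed would call allowed() before each request and sleep for request_delay() seconds between requests.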

Not sure if a site has a usable hidden API?

Send us the public URL and the fields you need. Our Hidden-API Audit tells you quickly whether a clean endpoint exists — and what a feed on it would look like.

Get a free feasibility check