ScrapeNest Docs
Everything you need to make your first request and ship to production.
Quickstart
Sign up at /signup, copy your API key (it is shown only once), and make your first request:
curl -X POST https://api.scrapenest.dev/v1/scrape/url \
  -H "Authorization: Bearer sn_live_..." \
  -H "Content-Type: application/json" \
  -d '{ "url": "https://example.com", "render_js": false }'
The response is JSON with the extracted text, optional schema-mapped fields, and metadata (cache hit, proxy tier used, response time).
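The same call in Python, using only the standard library. This is a sketch: the builder function below is a hypothetical helper, not part of an official SDK; the endpoint, headers, and body come from the curl example above.

```python
import json
import urllib.request

API_BASE = "https://api.scrapenest.dev/v1"

def build_scrape_request(api_key: str, url: str, render_js: bool = False) -> urllib.request.Request:
    """Build the POST /v1/scrape/url request shown in the curl example."""
    payload = json.dumps({"url": url, "render_js": render_js}).encode()
    return urllib.request.Request(
        f"{API_BASE}/scrape/url",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Sending it requires a real key and network access:
# with urllib.request.urlopen(build_scrape_request("sn_live_...", "https://example.com")) as resp:
#     result = json.load(resp)
```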
Authentication
Every request needs a bearer token in the Authorization header:
Authorization: Bearer sn_live_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Keys are argon2id-hashed on our side — we only store the hash and the public prefix. Lost the plaintext? Mint a new one in the dashboard and revoke the old.
Credits & cost
Every request consumes credits based on the infrastructure it uses:
| Operation | Credits | PAYG cost |
|---|---|---|
| Plain HTTP (datacenter, no JS) | 1 | $0.0002 |
| Datacenter + JS rendering | 5 | $0.0010 |
| Residential proxy, no JS | 10 | $0.0020 |
| Residential + JS | 25 | $0.0050 |
| Cache hit (any tier) | 1 | $0.0002 |
Pre-built scrapers (Maps reviews, Reddit posts, etc.) document their per-call credit cost on each endpoint page. See /dashboard/billing for plan inclusions.
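As a sanity check on the table above, pay-as-you-go cost is always credits × $0.0002. This helper is illustrative only (the lookup table mirrors the pricing table; the function itself is not an official SDK call):

```python
# Credits per operation, taken from the pricing table above.
CREDITS = {
    ("datacenter", False): 1,    # plain HTTP, no JS
    ("datacenter", True): 5,     # datacenter + JS rendering
    ("residential", False): 10,  # residential proxy, no JS
    ("residential", True): 25,   # residential + JS
}
PAYG_PER_CREDIT = 0.0002  # dollars per credit on pay-as-you-go

def request_cost(proxy_tier: str, render_js: bool, cache_hit: bool = False):
    """Return (credits, payg_dollars) for one request. Cache hits bill 1 credit."""
    credits = 1 if cache_hit else CREDITS[(proxy_tier, render_js)]
    return credits, round(credits * PAYG_PER_CREDIT, 4)
```

For example, a residential + JS request costs (25, 0.005), while a cache hit on the same request bills (1, 0.0002).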
Endpoints
Generic URL
POST /v1/scrape/url
{
  "url": "https://example.com/page",
  "schema": { "title": "h1", "price": ".price" },
  "render_js": true,
  "wait_for": ".price",
  "proxy_tier": "auto",
  "cache_ttl_seconds": 900
}
Fields:
- url — absolute http(s) URL. Required.
- schema — map of output key → CSS selector. Omit for raw text only.
- render_js — default true. Set false for plain HTTP (cheaper).
- wait_for — CSS selector to wait for before extraction (browser mode only).
- proxy_tier — auto | datacenter | residential.
- cache_ttl_seconds — 0 disables cache; default 900.
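A small helper that applies the documented defaults and drops optional fields left unset. The field names and defaults come from the list above; the function itself is a hypothetical convenience, not part of the API:

```python
def scrape_payload(url, schema=None, render_js=True, wait_for=None,
                   proxy_tier="auto", cache_ttl_seconds=900):
    """Build a /v1/scrape/url request body using the documented defaults."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must be an absolute http(s) URL")
    body = {
        "url": url,
        "render_js": render_js,
        "proxy_tier": proxy_tier,
        "cache_ttl_seconds": cache_ttl_seconds,
    }
    # schema and wait_for are optional; omit them rather than send null.
    if schema is not None:
        body["schema"] = schema
    if wait_for is not None:
        body["wait_for"] = wait_for
    return body
```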
The full set of pre-built endpoints (Google Maps, Reddit, AliExpress, Podcasts, …) is in the OpenAPI reference.
Caching
Requests are cached on the full normalized payload (URL + schema + render_js + wait_for + proxy_tier). Identical requests inside the TTL window hit cache and bill at 1 credit regardless of the original tier.
To bypass cache for live data, set cache_ttl_seconds: 0 on the request.
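To see why field order never affects cache behavior, here is one way a key over the normalized payload could be computed. This is purely illustrative; the actual server-side normalization is not specified here. Note that cache_ttl_seconds is not part of the key, matching the field list above.

```python
import hashlib
import json

# The fields the docs say the cache is keyed on.
CACHE_KEY_FIELDS = ("url", "schema", "render_js", "wait_for", "proxy_tier")

def cache_key(payload: dict) -> str:
    """Hash the cache-relevant fields, serialized with sorted keys so
    that JSON field order never changes the resulting key."""
    keyed = {f: payload.get(f) for f in CACHE_KEY_FIELDS}
    return hashlib.sha256(json.dumps(keyed, sort_keys=True).encode()).hexdigest()
```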
Errors
| Status | When |
|---|---|
| 401 | Missing or invalid bearer token |
| 422 | Request body failed validation (Pydantic) |
| 429 | Per-minute or per-day rate limit exceeded |
| 502 | Upstream failed — credits refunded automatically |
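A minimal client-side policy for the statuses above. The mapping follows the table; the action names are an illustrative convention, not a documented contract:

```python
def handle_status(status: int) -> str:
    """Map a response status to a client action, per the error table."""
    if status == 401:
        return "fix-auth"      # check or rotate the bearer token; retrying won't help
    if status == 422:
        return "fix-request"   # validation error; correct the body before resending
    if status == 429:
        return "retry-later"   # back off until the rate-limit window resets
    if status == 502:
        return "retry"         # upstream failure; credits were refunded, safe to retry
    return "ok" if status < 400 else "inspect"
```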
SDKs
v1 ships with REST only. Official Python and Node SDKs are slated for v1.1. In the meantime, curl, httpx, and fetch all work cleanly — see the code samples on the homepage.
Stuck? Email [email protected] or grab us in the Slack channel that ships with Pro+ plans.