ScrapeNest Docs
Everything you need to make your first request and ship to production.
Quickstart
Sign up at /signup, copy your API key (it is shown only once), and make your first request:
curl -X POST https://api.scrapenest.dev/v1/scrape/url \
  -H "Authorization: Bearer sn_live_..." \
  -H "Content-Type: application/json" \
  -d '{ "url": "https://example.com", "render_js": false }'
The response is JSON with the extracted text, optional schema-mapped fields, and metadata (cache hit, proxy tier used, response time).
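The same call in Python, using only the standard library. This is a sketch: the builder function below is a hypothetical helper, not part of an official SDK; the endpoint, headers, and body come from the curl example above.

```python
import json
import urllib.request

API_BASE = "https://api.scrapenest.dev/v1"

def build_scrape_request(api_key: str, url: str, render_js: bool = False) -> urllib.request.Request:
    """Build the POST /v1/scrape/url request shown in the curl example."""
    payload = json.dumps({"url": url, "render_js": render_js}).encode()
    return urllib.request.Request(
        f"{API_BASE}/scrape/url",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Sending it requires a real key and network access:
# with urllib.request.urlopen(build_scrape_request("sn_live_...", "https://example.com")) as resp:
#     result = json.load(resp)
```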
Authentication
Every request needs a bearer token in the Authorization header:
Authorization: Bearer sn_live_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Keys are argon2id-hashed on our side — we only store the hash and the public prefix. Lost the plaintext? Mint a new one in the dashboard and revoke the old.
Credits & cost
Every request consumes credits based on the infrastructure it uses:
| Operation | Credits | PAYG cost |
|---|---|---|
| Plain HTTP (datacenter, no JS) | 1 | $0.0002 |
| Datacenter + JS rendering | 5 | $0.0010 |
| Residential proxy, no JS | 10 | $0.0020 |
| Residential + JS | 25 | $0.0050 |
| Cache hit (any tier) | 1 | $0.0002 |
Pre-built scrapers (Maps reviews, Reddit posts, etc.) document their per-call credit cost on each endpoint page. See /dashboard/billing for plan inclusions.
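As a sanity check on the table above, pay-as-you-go cost is always credits × $0.0002. This helper is illustrative only (the lookup table mirrors the pricing table; the function itself is not an official SDK call):

```python
# Credits per operation, taken from the pricing table above.
CREDITS = {
    ("datacenter", False): 1,    # plain HTTP, no JS
    ("datacenter", True): 5,     # datacenter + JS rendering
    ("residential", False): 10,  # residential proxy, no JS
    ("residential", True): 25,   # residential + JS
}
PAYG_PER_CREDIT = 0.0002  # dollars per credit on pay-as-you-go

def request_cost(proxy_tier: str, render_js: bool, cache_hit: bool = False):
    """Return (credits, payg_dollars) for one request. Cache hits bill 1 credit."""
    credits = 1 if cache_hit else CREDITS[(proxy_tier, render_js)]
    return credits, round(credits * PAYG_PER_CREDIT, 4)
```

For example, a residential + JS request costs (25, 0.005), while a cache hit on the same request bills (1, 0.0002).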
Endpoints
Generic URL
POST /v1/scrape/url
{
  "url": "https://example.com/page",
  "schema": { "title": "h1", "price": ".price" },
  "render_js": true,
  "wait_for": ".price",
  "proxy_tier": "auto",
  "cache_ttl_seconds": 900
}
Fields:
- url — absolute http(s) URL. Required.
- schema — map of output key → CSS selector. Omit for raw text only.
- render_js — default true. Set false for plain HTTP (cheaper).
- wait_for — CSS selector to wait for before extraction (browser mode only).
- proxy_tier — auto | datacenter | residential.
- cache_ttl_seconds — 0 disables cache; default 900.
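A small helper that applies the documented defaults and drops optional fields left unset. The field names and defaults come from the list above; the function itself is a hypothetical convenience, not part of the API:

```python
def scrape_payload(url, schema=None, render_js=True, wait_for=None,
                   proxy_tier="auto", cache_ttl_seconds=900):
    """Build a /v1/scrape/url request body using the documented defaults."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must be an absolute http(s) URL")
    body = {
        "url": url,
        "render_js": render_js,
        "proxy_tier": proxy_tier,
        "cache_ttl_seconds": cache_ttl_seconds,
    }
    # schema and wait_for are optional; omit them rather than send null.
    if schema is not None:
        body["schema"] = schema
    if wait_for is not None:
        body["wait_for"] = wait_for
    return body
```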
The full set of pre-built endpoints (Google Maps, Reddit, AliExpress, Podcasts, …) is in the OpenAPI reference.
Caching
Requests are cached on the full normalized payload (URL + schema + render_js + wait_for + proxy_tier). Identical requests inside the TTL window hit cache and bill at 1 credit regardless of the original tier.
To bypass cache for live data, set cache_ttl_seconds: 0 on the request.
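To see why field order never affects cache behavior, here is one way a key over the normalized payload could be computed. This is purely illustrative; the actual server-side normalization is not specified here. Note that cache_ttl_seconds is not part of the key, matching the field list above.

```python
import hashlib
import json

# The fields the docs say the cache is keyed on.
CACHE_KEY_FIELDS = ("url", "schema", "render_js", "wait_for", "proxy_tier")

def cache_key(payload: dict) -> str:
    """Hash the cache-relevant fields, serialized with sorted keys so
    that JSON field order never changes the resulting key."""
    keyed = {f: payload.get(f) for f in CACHE_KEY_FIELDS}
    return hashlib.sha256(json.dumps(keyed, sort_keys=True).encode()).hexdigest()
```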
Errors
| Status | When |
|---|---|
| 401 | Missing or invalid bearer token |
| 422 | Request body failed validation (Pydantic) |
| 429 | Per-minute or per-day rate limit exceeded |
| 502 | Upstream failed — credits refunded automatically |
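A minimal client-side policy for the statuses above. The mapping follows the table; the action names are an illustrative convention, not a documented contract:

```python
def handle_status(status: int) -> str:
    """Map a response status to a client action, per the error table."""
    if status == 401:
        return "fix-auth"      # check or rotate the bearer token; retrying won't help
    if status == 422:
        return "fix-request"   # validation error; correct the body before resending
    if status == 429:
        return "retry-later"   # back off until the rate-limit window resets
    if status == 502:
        return "retry"         # upstream failure; credits were refunded, safe to retry
    return "ok" if status < 400 else "inspect"
```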
SDKs
v1 ships with REST only. Official Python and Node SDKs are slated for v1.1. In the meantime, curl, httpx, and fetch all work cleanly — see the code samples on the homepage.
Stuck? Email [email protected] or grab us in the Slack channel that ships with Pro+ plans.