Use Batch Scraping to submit up to 10,000 URLs in a single API call. This is useful when you need to extract the same type of data from a large number of pages — for example, scraping product pages or company profiles at scale. Batch Scrape works by sending an array of requests under the requests field.

When to Use Batch Scraping

  • Extracting titles, authors, and publish dates from a list of blog or news article URLs
  • Running aiExtractRules across a set of company websites to collect structured data like founding year, services, and contact info
  • Gathering legal notices or disclaimers from the footer pages of 5,000+ policy URLs

Submit a Batch Scrape Job

curl --location 'https://api.hasdata.com/scrape/batch/web' \
--header 'Content-Type: application/json' \
--header 'x-api-key: <your-api-key>' \
--data '{
  "requests": [
    {
      "url": "https://hasdata.com",
      "outputFormat": ["text", "html"],
      "aiExtractRules": {
        "company": { "type": "string" },
        "email": { "type": "string" },
        "yearFounded": { "type": "number" },
        "isHiring": { "type": "boolean" }
      }
    },
    {
      "url": "https://example.com",
      "outputFormat": ["text", "html"],
      "aiExtractRules": {
        "company": { "type": "string" },
        "email": { "type": "string" },
        "yearFounded": { "type": "number" },
        "isHiring": { "type": "boolean" }
      }
    }
  ]
}'

Response

{
  "jobId": "9a35f32e-4f9c-4d49-9c6e-7c4de4a091e0",
  "status": "ok"
}
This means the batch job was accepted and is being processed asynchronously.

Get Job Status & Results

To check the status of your batch job:
curl --request GET \
  --url 'https://api.hasdata.com/scrape/batch/web/9a35f32e-4f9c-4d49-9c6e-7c4de4a091e0' \
  --header 'x-api-key: <your-api-key>'
To retrieve results once ready (supports pagination):
curl --request GET \
  --url 'https://api.hasdata.com/scrape/batch/web/9a35f32e-4f9c-4d49-9c6e-7c4de4a091e0/results?page=1&limit=100' \
  --header 'x-api-key: <your-api-key>'
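Once the job is ready, results can be collected page by page. A sketch using only the Python standard library (pages_needed and fetch_all_results are illustrative helpers; retries and error handling are omitted for brevity):

```python
import json
import math
import urllib.request

API_BASE = "https://api.hasdata.com/scrape/batch/web"

def pages_needed(total, limit):
    # Number of pages required to cover `total` results at `limit` per page.
    return math.ceil(total / limit) if total else 0

def fetch_all_results(job_id, api_key, limit=100):
    # Walks the paginated results endpoint until every result is collected.
    # Assumes the response shape shown in the example below (total, results).
    results, page = [], 1
    while True:
        url = f"{API_BASE}/{job_id}/results?page={page}&limit={limit}"
        req = urllib.request.Request(url, headers={"x-api-key": api_key})
        with urllib.request.urlopen(req) as resp:
            data = json.load(resp)
        results.extend(data["results"])
        if len(results) >= data["total"]:
            return results
        page += 1
```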

Example Result

{
  "page": 0,
  "limit": 100,
  "total": 2,
  "results": [
    {
      "query": {
        "url": "https://example.com",
        "outputFormat": [
          "text",
          "html"
        ],
        "aiExtractRules": {
          "company": {
            "type": "string"
          },
          "email": {
            "type": "string"
          },
          "yearFounded": {
            "type": "number"
          },
          "isHiring": {
            "type": "boolean"
          }
        }
      },
      "result": {
        "id": "ca061bb6-64d8-462d-8ee9-16dd5aaf5efe",
        "status": "ok",
        "json": "https://files-dev.hasdata.com/ca061bb6-64d8-462d-8ee9-16dd5aaf5efe.json"
      }
    },
    {
      "query": {
        "url": "https://hasdata.com",
        "outputFormat": [
          "text",
          "html"
        ],
        "aiExtractRules": {
          "company": {
            "type": "string"
          },
          "email": {
            "type": "string"
          },
          "yearFounded": {
            "type": "number"
          },
          "isHiring": {
            "type": "boolean"
          }
        }
      },
      "result": {
        "id": "e6a75ddb-2d3f-40a5-9610-748ec3bd34a2",
        "status": "ok",
        "json": "https://files-dev.hasdata.com/e6a75ddb-2d3f-40a5-9610-748ec3bd34a2.json"
      }
    }
  ]
}

Notes

  • Maximum batch size: 10,000 URLs
  • Failed URLs do not consume credits
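Given the 10,000-URL cap, larger URL lists must be split across multiple batch jobs. A minimal Python sketch (chunk_urls is a hypothetical helper):

```python
MAX_BATCH_SIZE = 10_000  # per-job limit documented above

def chunk_urls(urls, size=MAX_BATCH_SIZE):
    # Splits a flat URL list into consecutive chunks of at most `size`.
    return [urls[i:i + size] for i in range(0, len(urls), size)]

batches = chunk_urls([f"https://example.com/p/{i}" for i in range(25_000)])
# -> three batches: 10,000 + 10,000 + 5,000 URLs, one job each
```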