Examples
Five practical Apify examples: single URL scrape, full crawler run, dataset retrieval, scheduled actor runs, and task-based scrape jobs.
Example 1: Scrape a Product Page and Extract Structured Data
{
  "resource": "actor",
  "operation": "scrapeSingleUrl",
  "api_token": "{{ credentials.apify_token }}",
  "url": "{{ vars.competitor_product_url }}",
  "wait_for_selector_css": ".product-price",
  "extract_html": false,
  "extract_text": true,
  "take_screenshot": true,
  "screenshot_width": 1280
}
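Because extract_text is enabled, the price and title have to be pulled out of unstructured page text in a later step. The Python sketch below only illustrates the kind of regex logic such a step might apply; it is not part of the Apify node. The function names, the currency pattern, and the idea that the title is the first non-empty line are all assumptions made for illustration.

```python
import re
from decimal import Decimal
from typing import Optional

# Hypothetical pattern: a currency symbol followed by an amount such as 1,299.00.
PRICE_RE = re.compile(r"[$€£]\s*(\d[\d,]*(?:\.\d+)?)")


def extract_price(page_text: str) -> Optional[Decimal]:
    """Return the first currency amount found in the page text, or None."""
    match = PRICE_RE.search(page_text)
    if match is None:
        return None
    # Drop thousands separators before converting to a Decimal.
    return Decimal(match.group(1).replace(",", ""))


def extract_title(page_text: str) -> Optional[str]:
    """Assumption: the first non-empty line of the page text is the title."""
    for line in page_text.splitlines():
        if line.strip():
            return line.strip()
    return None
```

Real product pages often list several prices (list price, sale price, shipping), so a production pattern would likely need to anchor on nearby label text rather than take the first match.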
Expected outcome: The success port fires with text containing the full page text and screenshotBase64 containing a PNG of the page. Connect a DataMapping node to parse out the price and title using regex or string expressions.
Example 2: Run a Full Website Crawler Actor and Poll for Completion
// Step 1 — actor/run node
{
  "resource": "actor",
  "operation": "run",
  "api_token": "{{ credentials.apify_token }}",
  "actor_id": "apify/website-content-crawler",
  "memory_mb": 1024,
  "timeout_seconds": 600,
  "input_json": "{\"startUrls\":[{\"url\":\"{{ vars.target_website }}\"}],\"maxCrawlDepth\":3,\"maxCrawlPages\":200}"
}
// Step 2 — Delay node (30 seconds)
// Step 3 — actorRun/getRun node (loop back to Step 2 while status is "READY" or "RUNNING")
{
  "resource": "actorRun",
  "operation": "getRun",
  "api_token": "{{ credentials.apify_token }}",
  "run_id": "{{ nodes.step1.id }}"
}
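The delay-and-poll loop in Steps 2 and 3 can be sketched in Python as follows. This is an illustration of the control flow only, not how the workflow engine implements it: get_run is a hypothetical callable standing in for the actorRun/getRun node, and the set of terminal statuses reflects my understanding of the Apify API and should be checked against its reference. Bounding the number of polls is an added safeguard that the example above does not mention.

```python
import time
from typing import Callable, Dict

# Assumed terminal statuses; any other value (e.g. READY, RUNNING) means "keep waiting".
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "TIMED-OUT", "ABORTED"}


def wait_for_run(
    get_run: Callable[[], Dict],
    interval_s: float = 30.0,
    max_polls: int = 40,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll get_run() until the run reaches a terminal status, then return it."""
    for _ in range(max_polls):
        run = get_run()
        if run["status"] in TERMINAL_STATUSES:
            return run
        sleep(interval_s)  # corresponds to the Delay node
    raise TimeoutError(f"run still not finished after {max_polls} polls")
```

Checking for a terminal status, rather than only for "not RUNNING", avoids exiting the loop early if the run is still queued when first polled. A caller would then branch on run["status"] and read run["defaultDatasetId"] only when it equals "SUCCEEDED".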
Expected outcome: After the loop exits with status = "SUCCEEDED", route to dataset/getItems using defaultDatasetId from the run object to retrieve all crawled pages.
Example 3: Get Items from a Completed Dataset Run
{
  "resource": "dataset",
  "operation": "getItems",
  "api_token": "{{ credentials.apify_token }}",
  "dataset_id": "{{ vars.dataset_id }}",
  "clean": true,
  "limit": 100,
  "offset": 0,
  "fields": "title,price,url,sku"
}
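The limit and offset parameters retrieve a single page of results. For datasets larger than one page, the call can be repeated with an increasing offset, as in this Python sketch. The fetch_page callable is hypothetical and stands in for the dataset/getItems node. The sketch assumes that the offset counts raw dataset items, so it advances by the page size even when a page comes back shorter, and it stops at the first empty page.

```python
from typing import Callable, Dict, Iterator, List


def iter_dataset_items(
    fetch_page: Callable[[int, int], List[Dict]],
    page_size: int = 100,
) -> Iterator[Dict]:
    """Yield every item by calling fetch_page(offset, limit) until a page is empty."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            break  # an empty page is treated as the end of the dataset
        yield from page
        offset += page_size
```

Stopping on an empty page costs one extra request compared with stopping on a short page, but it stays correct if filtering returns fewer items than the requested limit.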
Expected outcome: The success port returns an items array of up to 100 objects, each containing only the four requested fields. Feed this into a Loop node to process each product record individually.
Example 4: Schedule Actor Runs and Process Results
// Triggered by ScheduledTrigger (daily 06:00)
// Apify node — actor/runAndGetDatasetItems
{
  "resource": "actor",
  "operation": "runAndGetDatasetItems",
  "api_token": "{{ credentials.apify_token }}",
  "actor_id": "apify/cheerio-scraper",
  "memory_mb": 512,
  "timeout_seconds": 300,
  "max_items": 500,
  "input_json": "{\"startUrls\":[{\"url\":\"{{ vars.pricing_page }}\"}],\"pageFunction\":\"async ({ $, request }) => ({ product: $('h1').text(), price: $('.price').text(), url: request.url })\"}"
}
// Loop node over items array
// → IfCondition: items[i].price !== vars.last_known_prices[items[i].product]
// → MongoDB/insertOne to record price change with timestamp
Expected outcome: Each morning, fresh pricing data is collected and only changed prices are written to MongoDB, building a historical price change log for analytics.
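The Loop and IfCondition steps in this example amount to a comparison against the last known prices. A minimal Python sketch of that comparison is shown below. The function name and the shape of the change record (old_price, new_price, changed_at) are assumptions made for illustration, and prices are compared as the raw strings the scraper returns.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional


def detect_price_changes(
    items: List[Dict[str, str]],
    last_known_prices: Dict[str, str],
    now: Optional[datetime] = None,
) -> List[Dict[str, Optional[str]]]:
    """Return one change record per item whose price differs from the stored one."""
    now = now or datetime.now(timezone.utc)
    changes = []
    for item in items:
        product = item["product"].strip()
        price = item["price"].strip()
        previous = last_known_prices.get(product)  # None for a new product
        if previous != price:
            changes.append({
                "product": product,
                "old_price": previous,
                "new_price": price,
                "url": item["url"],
                "changed_at": now.isoformat(),
            })
    return changes
```

Each returned record corresponds to one MongoDB/insertOne call in the workflow. Note that a product seen for the first time is reported as a change with old_price set to None; whether that is desirable depends on how the price log is used.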
Example 5: Use actorTask for Pre-configured Scrape Jobs
{
  "resource": "actorTask",
  "operation": "runAndGetDatasetItems",
  "api_token": "{{ credentials.apify_token }}",
  "task_id": "{{ vars.apify_task_id }}",
  "max_items": 200
}
// Override task input for this specific run only (optional):
{
  "resource": "actorTask",
  "operation": "runAndGetDatasetItems",
  "api_token": "{{ credentials.apify_token }}",
  "task_id": "{{ vars.apify_task_id }}",
  "input_json": "{\"startDate\":\"{{ vars.report_start_date }}\",\"endDate\":\"{{ vars.report_end_date }}\"}",
  "max_items": 1000
}
Expected outcome: The task runs with its saved actor configuration. The first variant uses the saved input as-is. The second variant passes a date range override — useful for parameterised report generation without exposing full actor configuration details to the workflow.
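Hand-escaping the input_json string is error-prone, so when the node configuration is generated by a script it may be easier to serialise the override with a JSON library. The Python sketch below shows one way to do that for the date range variant. The function name is hypothetical, and the startDate and endDate keys are simply those used in the example above; the keys a real task accepts depend on its actor.

```python
import json
from datetime import date
from typing import Dict, Union


def build_task_override(
    task_id: str,
    start: date,
    end: date,
    max_items: int = 1000,
) -> Dict[str, Union[str, int]]:
    """Build the node configuration with input_json serialised by json.dumps."""
    if start > end:
        raise ValueError("start date must not be after end date")
    override = {"startDate": start.isoformat(), "endDate": end.isoformat()}
    return {
        "resource": "actorTask",
        "operation": "runAndGetDatasetItems",
        "api_token": "{{ credentials.apify_token }}",  # resolved by the workflow engine
        "task_id": task_id,
        "input_json": json.dumps(override),
        "max_items": max_items,
    }
```

For example, build_task_override("my-task", date(2024, 1, 1), date(2024, 1, 31)) yields a configuration whose input_json decodes back to the two ISO-formatted dates.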