Schedule and triggers

Scheduling, triggers, and concurrency for Scrape

Schedule

  • Runs as a Cloud Run Job named scrape.
  • Schedule is external (Cloud Scheduler or manual trigger).
  • Expected runtime varies by clinic count; Cloud Run timeout is 2 hours.
  • Do not add new orgs or clinics to Scrape.

Triggers

  • Manual execution via Cloud Run Jobs.
  • Optional scheduler-based triggers (not defined in this repo).
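As an illustration of a manual trigger, the sketch below builds the gcloud command that executes the scrape job and hands it to subprocess. The region value and the helper names (build_execute_command, trigger_scrape) are assumptions for the example and are not defined in this repo; it also assumes the gcloud CLI is installed and authenticated.

```python
import subprocess

JOB_NAME = "scrape"  # Cloud Run Job name from the Schedule section


def build_execute_command(region, wait=True):
    """Return the gcloud argv that starts one execution of the job.

    `region` is a placeholder supplied by the caller; this page does not
    state which region the job runs in.
    """
    cmd = ["gcloud", "run", "jobs", "execute", JOB_NAME, "--region", region]
    if wait:
        # --wait blocks until the execution finishes instead of returning
        # as soon as it has been started.
        cmd.append("--wait")
    return cmd


def trigger_scrape(region):
    """Run the command; raises CalledProcessError on a non-zero exit."""
    return subprocess.run(build_execute_command(region), check=True)
```

Calling trigger_scrape("us-central1") would start one execution, with the region here being a made-up example. A Cloud Scheduler trigger, if one is configured, lives outside this repo and is not shown.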

Concurrency and idempotency

  • Job-level parallelism is 1 (a single task per execution); within that task, a semaphore limits clinic processing to 5 clinics at a time.
  • Spanner writes use insert_or_update mutations for idempotency.
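The two points above can be sketched together as follows. This is a minimal illustration, not the job's actual code: it assumes an asyncio-based worker, uses a plain dict as a stand-in for the Spanner table, and the names process_clinic and run_scrape are hypothetical. Writing by key into the dict mimics what an insert_or_update mutation gives you: re-running the job overwrites rows rather than duplicating them.

```python
import asyncio

MAX_CONCURRENT_CLINICS = 5  # semaphore limit stated above


async def process_clinic(clinic_id, store, stats, sem):
    # At most MAX_CONCURRENT_CLINICS coroutines are inside this block at once.
    async with sem:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        try:
            await asyncio.sleep(0.01)  # stand-in for the real scrape work
            # Keyed write: a repeat run replaces the row instead of adding one.
            store[clinic_id] = {"clinic_id": clinic_id, "status": "scraped"}
        finally:
            stats["active"] -= 1


async def run_scrape(clinic_ids, store):
    """Process all clinics and return the peak number processed concurrently."""
    sem = asyncio.Semaphore(MAX_CONCURRENT_CLINICS)
    stats = {"active": 0, "peak": 0}
    await asyncio.gather(
        *(process_clinic(c, store, stats, sem) for c in clinic_ids)
    )
    return stats["peak"]
```

Running run_scrape twice over the same clinic IDs leaves the store with one entry per clinic, which is the property that makes a retried or re-triggered execution safe.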
