Operations
Monitoring, retries, and scaling for Scrape
Monitoring
- Sentry is required; the job fails if the Sentry DSN is missing.
- Logs include per-clinic progress and per-table row counts.
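The required-DSN behavior can be sketched as a fail-fast check. This is only an illustration, not the job's actual code: the environment variable name `SENTRY_DSN`, the function `require_sentry_dsn`, and the exception `MissingSentryDsnError` are all assumptions, as is the idea that the check runs before any scraping starts.

```python
import os


class MissingSentryDsnError(RuntimeError):
    """Hypothetical error raised when no Sentry DSN is configured."""


def require_sentry_dsn(env=None):
    """Return the configured DSN, or raise so the job fails.

    `SENTRY_DSN` is an assumed variable name; the real job may read
    its DSN from somewhere else.
    """
    env = os.environ if env is None else env
    dsn = env.get("SENTRY_DSN", "").strip()
    if not dsn:
        raise MissingSentryDsnError("SENTRY_DSN is not set; refusing to run the job")
    # A real job would typically pass this value on, e.g. sentry_sdk.init(dsn=dsn).
    return dsn
```

Raising instead of logging a warning means a misconfigured run exits with an error rather than scraping without error reporting.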
Retries and backfills
- Playwright login and report downloads retry on transient failures.
- Rerun the Cloud Run Job for backfills; writes are idempotent.
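A retry on transient failures might look roughly like the sketch below. The attempt count, the exponential backoff, and the names `retry` and `TransientError` are assumptions made for illustration; the source does not say how the login and download retries are implemented.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, flaky logins)."""


def retry(operation, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call `operation` until it succeeds or `attempts` is exhausted.

    Waits base_delay, then 2 * base_delay, and so on between attempts.
    Only TransientError is retried; any other exception propagates at once.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == attempts:
                raise  # out of attempts: surface the failure to the caller
            sleep(base_delay * 2 ** (attempt - 1))
```

For example, a download step could be wrapped as `retry(lambda: download_report(clinic), attempts=3)`, where `download_report` is a hypothetical helper. Because writes are idempotent, a run that still fails after its retries can be rerun without duplicating rows.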
Resource usage
- Cloud Run Job resources: 8 CPU, 16Gi memory, 2-hour timeout.
- Clinics are processed concurrently, with a semaphore capping concurrency at 5.
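The semaphore-bounded concurrency described above can be sketched with `asyncio`. This assumes the job is written with asyncio coroutines; `process_all` and `process_clinic` are hypothetical names, and only the limit of 5 comes from the text.

```python
import asyncio

MAX_CONCURRENT_CLINICS = 5  # the limit stated above


async def process_all(clinic_ids, process_clinic, limit=MAX_CONCURRENT_CLINICS):
    """Run `process_clinic` for every clinic, at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(clinic_id):
        # Each task waits here until one of the `limit` slots is free.
        async with semaphore:
            return await process_clinic(clinic_id)

    # gather returns results in the same order as clinic_ids.
    return await asyncio.gather(*(guarded(c) for c in clinic_ids))
```

A caller would run it with `asyncio.run(process_all(clinic_ids, process_clinic))`. Capping concurrency keeps the number of simultaneous browser sessions, and so memory use, bounded within the job's resource limits.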