Nest Engineering Docs
Scrape

Overview

Python 3.13 scraper that loads EzyVet CSV reports into Cloud Spanner

Scrape is a Python 3.13 Cloud Run Job that logs into EzyVet via Playwright, downloads CSV reports, transforms them with Polars, and upserts normalized records into Cloud Spanner. It only runs for clinics flagged for scraping.
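The flow described above (download, transform, upsert, limited to flagged clinics) can be sketched roughly as follows. This is an illustration only, not the job's actual code: the `Clinic` class, the `scrape_enabled` flag name, and the `run_scrape` function are hypothetical, and the three stages are passed in as plain callables so the sketch stays independent of Playwright, Polars, and the Spanner client.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Clinic:
    clinic_id: str
    scrape_enabled: bool  # hypothetical name for the "flagged for scraping" field


def run_scrape(
    clinics: Iterable[Clinic],
    download: Callable[[Clinic], str],
    transform: Callable[[str], list[dict]],
    upsert: Callable[[Clinic, list[dict]], None],
) -> list[str]:
    """Run download -> transform -> upsert for scrape-enabled clinics only.

    Returns the IDs of the clinics that were processed, in order.
    """
    processed = []
    for clinic in clinics:
        if not clinic.scrape_enabled:
            continue  # clinics not flagged for scraping are skipped entirely
        csv_text = download(clinic)    # stands in for the Playwright login + CSV export
        records = transform(csv_text)  # stands in for the Polars transform
        upsert(clinic, records)        # stands in for the Spanner upsert
        processed.append(clinic.clinic_id)
    return processed
```

Passing the stages in as callables is a choice made for this sketch; it keeps the gating logic testable without a browser or a database.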

Job profile

Field            Value
Code             jobs/scrape/
Package          scrape
Runtime          Python 3.13 (Cloud Run Job)
Status           Legacy (maintained)
Primary owner    Joe Pardi
Secondary owner  Akansh Divker
Trigger          Cloud Run Job (manual or scheduler)
Data source      EzyVet UI reports + Spanner metadata
Data sink        Cloud Spanner

Purpose

  • Pull CSV reports from EzyVet for scrape-enabled clinics.
  • Normalize report data into Nest's Spanner schema.
  • Provide a legacy ingestion path for clinics without API-based ingestion.
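As a rough illustration of what "normalize" can mean for a CSV report, the sketch below trims values, converts headers to snake_case, and rewrites a date column to ISO format. It uses the standard library `csv` module as a stand-in for the Polars transform the job actually uses, and the column names and the `%d-%m-%Y` date format are assumptions, not taken from a real EzyVet export.

```python
import csv
import io
from datetime import datetime


def normalize_report(csv_text: str) -> list[dict]:
    """Parse a CSV export into records with snake_case keys and trimmed values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    records = []
    for row in reader:
        record = {}
        for header, value in row.items():
            if header is None:
                continue  # ignore extra cells that have no matching header
            key = header.strip().lower().replace(" ", "_")
            # Empty or missing cells become None rather than "".
            record[key] = (value or "").strip() or None
        # Hypothetical date column and format; the real export may differ.
        if record.get("appointment_date"):
            parsed = datetime.strptime(record["appointment_date"], "%d-%m-%Y")
            record["appointment_date"] = parsed.date().isoformat()
        records.append(record)
    return records
```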

Non-goals

  • Onboarding new orgs or clinics (scrape is not expanding).
  • Real-time ingestion or incremental API syncing.
  • Replacing the Handler ETL pipelines.

Lifecycle notes

  • Scrape is intended to be phased out.
  • Migration to Handler's ezyVet pipelines is non-trivial, so the job will be maintained until migration is complete.

Inputs and outputs

  • Inputs: Clinics/Organizations metadata from Spanner, EzyVet credentials, and EzyVet CSV exports.
  • Outputs: Spanner tables (households, contacts, patients, appointments, invoices, invoice lines, team members).
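To make the output side concrete, here is a minimal sketch of shaping normalized records into the columns-plus-rows form that a batch upsert typically expects. The `to_mutation` helper and the column names are hypothetical; how the job actually writes to Spanner is not documented here.

```python
def to_mutation(
    records: list[dict], columns: tuple[str, ...]
) -> tuple[tuple[str, ...], list[tuple]]:
    """Shape records into (columns, values) with a fixed column order.

    Keys missing from a record become None so every row has the same width.
    """
    values = [tuple(record.get(col) for col in columns) for record in records]
    return columns, values


# Assumption: the result could then be passed to an insert-or-update
# mutation in the Spanner client, for example
#   batch.insert_or_update(table="Patients", columns=cols, values=vals)
# where the table name "Patients" is illustrative only.
```

Fixing the column order up front means every row lines up with the same column list regardless of which keys a given record happens to carry.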
