# Scrape

## Overview

Python 3.13 EzyVet CSV scraper into Cloud Spanner.
Scrape is a Python 3.13 Cloud Run Job that logs into EzyVet via Playwright, downloads CSV reports, transforms them with Polars, and upserts normalized records into Cloud Spanner. It runs only for clinics flagged for scraping.
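The overall flow can be sketched as below. All identifiers here (`scrape_enabled`, the flag name, the per-clinic loop body) are illustrative assumptions, not the job's actual API; the Playwright steps are elided because selectors and report URLs are deployment-specific.

```python
# Illustrative sketch of the Scrape job's high-level flow.
# Function and field names are assumptions, not the real implementation.

def scrape_enabled(clinics: list[dict]) -> list[dict]:
    """Keep only clinics flagged for scraping (hypothetical flag name)."""
    return [c for c in clinics if c.get("scrape_enabled")]

def run_job(clinics: list[dict]) -> None:
    # Playwright import kept local so the pure helper above stays
    # importable without a browser environment.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        for clinic in scrape_enabled(clinics):
            # Log into EzyVet with the clinic's credentials, download its
            # CSV reports, then transform and upsert (see later sections).
            ...
        browser.close()
```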
## Job profile
| Field | Value |
|---|---|
| Code | jobs/scrape/ |
| Package | scrape |
| Runtime | Python 3.13 (Cloud Run Job) |
| Status | Legacy (maintained) |
| Primary owner | Joe Pardi |
| Secondary owner | Akansh Divker |
| Trigger | Cloud Run Job (manual or scheduler) |
| Data source | EzyVet UI reports + Spanner metadata |
| Data sink | Cloud Spanner |
## Purpose
- Pull CSV reports from EzyVet for scrape-enabled clinics.
- Normalize report data into Nest's Spanner schema.
- Provide a legacy ingestion path for clinics without API-based ingestion.
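The normalization step maps EzyVet's CSV column headers onto Nest's Spanner column names. The real job does this with Polars; the sketch below uses the stdlib `csv` module purely for illustration, and every column name shown is a hypothetical example rather than the actual schema.

```python
import csv
import io

# Hypothetical EzyVet appointment export; real column names differ.
RAW = """Appointment ID,Patient,Start Time
101,Rex,2024-01-05 09:00
102,Milo,2024-01-05 09:30
"""

def normalize(raw_csv: str) -> list[dict]:
    """Map EzyVet CSV columns onto (assumed) Spanner column names."""
    rows = csv.DictReader(io.StringIO(raw_csv))
    return [
        {
            "AppointmentId": int(r["Appointment ID"]),
            "PatientName": r["Patient"].strip(),
            "StartTime": r["Start Time"],
        }
        for r in rows
    ]
```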
## Non-goals
- Onboarding new orgs or clinics (scrape is not expanding).
- Real-time ingestion or incremental API syncing.
- Replacing the Handler ETL pipelines.
## Lifecycle notes
- Scrape is intended to be phased out.
- Migration to Handler's EzyVet pipelines is non-trivial, so the job will be maintained until migration is complete.
## Inputs and outputs
- Inputs: Spanner Clinics/Organizations, EzyVet credentials, EzyVet CSV exports.
- Outputs: Spanner tables (households, contacts, patients, appointments, invoices, invoice lines, team members).
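An "upsert" into Spanner is typically done with the client library's `insert_or_update` mutation, which makes re-running the job for the same report idempotent. The sketch below shows that shape; the table name, column names, and the `to_mutation_rows` helper are assumptions for illustration, not the job's real schema.

```python
def to_mutation_rows(records: list[dict], columns: tuple[str, ...]) -> list[tuple]:
    """Flatten normalized dicts into column-ordered tuples for a Spanner batch."""
    return [tuple(r[c] for c in columns) for r in records]

def upsert_appointments(database, records: list[dict]) -> None:
    # `database` is a google-cloud-spanner Database; insert_or_update
    # writes new rows and overwrites existing ones by primary key.
    columns = ("AppointmentId", "PatientName", "StartTime")  # assumed schema
    with database.batch() as batch:
        batch.insert_or_update(
            table="Appointments",  # assumed table name
            columns=columns,
            values=to_mutation_rows(records, columns),
        )
```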