Nest Engineering Docs
Datalake

Configuration

Runtime configuration and environment variables for Datalake.

Required environment

export GCP_SPANNER_PROJECT_ID="..."
export GCP_BIGQUERY_PROJECT_ID="..."
export BQ_DATASET_ID="..."
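
A startup script can fail fast when one of the three required variables is missing. The following is a minimal sketch, not part of Datalake itself: `check_required_env` is a hypothetical helper name, and it treats an empty value the same as an unset one, which is an assumption.

```shell
# Hypothetical helper: report every required Datalake variable that is
# unset or empty. Returns 0 when all are present, 1 otherwise.
check_required_env() {
  missing=0
  for name in GCP_SPANNER_PROJECT_ID GCP_BIGQUERY_PROJECT_ID BQ_DATASET_ID; do
    # Indirect lookup: read the value of the variable whose name is in $name.
    eval "value=\${$name-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

One way to use it is to call `check_required_env || exit 1` before starting the service, so that all missing names are printed in a single run.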

Optional environment

export DEBUG="true"
export SENTRY_ENVIRONMENT="development"
export LOOKBACK_HOURS=0
export MAX_MESSAGES=20
export MAX_BYTES=$((5 * 1024 * 1024))  # 5 MiB (5242880 bytes)
export ACK_DEADLINE_SECONDS=60
export ACK_DEADLINE_EXTENSION_SECONDS=30
export BQ_TABLE_ID_SUFFIX="_changelog"
export HASH_DELIMITER="|"
export DEFAULT_SOURCE_SYSTEM="NEST"
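
Several of the optional variables above carry integer values. The sketch below shows one way to catch a mistyped value before the service starts. It assumes these five variables are meant to hold non-negative integers and that leaving them unset is acceptable; `is_uint` and `check_numeric_env` are hypothetical helper names.

```shell
# Succeeds only for a non-empty string made entirely of decimal digits.
is_uint() {
  case "$1" in
    '' | *[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Hypothetical helper: for each numeric variable that is set and non-empty,
# verify that it holds a non-negative integer. Unset variables are skipped.
check_numeric_env() {
  bad=0
  for name in LOOKBACK_HOURS MAX_MESSAGES MAX_BYTES \
              ACK_DEADLINE_SECONDS ACK_DEADLINE_EXTENSION_SECONDS; do
    # Indirect lookup: read the value of the variable whose name is in $name.
    eval "value=\${$name-}"
    if [ -n "$value" ] && ! is_uint "$value"; then
      echo "$name must be a non-negative integer, got: $value" >&2
      bad=1
    fi
  done
  return "$bad"
}
```

Because `MAX_BYTES` is exported after the shell evaluates `$((5 * 1024 * 1024))`, the check sees the plain number 5242880 rather than the arithmetic expression.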

Pub/Sub configuration

export SOURCE_TOPIC_NAME="nest-spanner-datalake-stream"
export SOURCE_SUBSCRIPTION_NAME="nest-spanner-datalake-stream-subscription"
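
Pub/Sub client libraries and the API address topics and subscriptions by fully qualified resource name, `projects/<project>/topics/<topic>` and `projects/<project>/subscriptions/<subscription>`. The sketch below builds those names from the short names above. The helper names are hypothetical, and so is `PUBSUB_PROJECT_ID` in the usage note: this page does not say which project hosts the topic and subscription.

```shell
# Hypothetical helpers: build fully qualified Pub/Sub resource names.
# $1 is the project ID, $2 is the short topic or subscription name.
pubsub_topic_path() {
  printf 'projects/%s/topics/%s\n' "$1" "$2"
}

pubsub_subscription_path() {
  printf 'projects/%s/subscriptions/%s\n' "$1" "$2"
}
```

For example, `pubsub_subscription_path "$PUBSUB_PROJECT_ID" "$SOURCE_SUBSCRIPTION_NAME"` prints the full subscription name for whichever project ID is supplied.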

Notes

  • GCP_SPANNER_PROJECT_ID is resolved from metadata when running on GCP.
  • Application Default Credentials (ADC) are required to access BigQuery, Pub/Sub, and Secret Manager.
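
For local development, ADC (Application Default Credentials) usually come from a credentials file: either the one named by `GOOGLE_APPLICATION_CREDENTIALS` or the one written by `gcloud auth application-default login`. The sketch below checks whether such a file exists. The helper names are hypothetical, the default path is the usual gcloud location on Linux and macOS, and the check says nothing about whether the credentials are valid. On GCP itself, credentials are normally supplied by the attached service account, so no file is expected there.

```shell
# Print the path where a local ADC file is expected:
# GOOGLE_APPLICATION_CREDENTIALS if set, otherwise the gcloud default
# location on Linux and macOS (an assumption for other platforms).
adc_file_path() {
  printf '%s\n' "${GOOGLE_APPLICATION_CREDENTIALS:-$HOME/.config/gcloud/application_default_credentials.json}"
}

# Hypothetical helper: succeeds when the expected ADC file exists.
# It does not verify that the credentials inside are valid.
has_local_adc() {
  [ -f "$(adc_file_path)" ]
}
```

A local setup script might run `has_local_adc || gcloud auth application-default login` to prompt for credentials only when none are found.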
