Nest Engineering Docs
Event Logging

Operations

Deployments, auth posture, and operational behavior for Event Logging

Deployments

  • Build from services/events/Dockerfile.
  • Cloud Build pipeline: ci/cloudbuild/services-events.yaml
  • Cloud Deploy manifests:
    • services/events/deploy/skaffold.yaml
    • services/events/deploy/service.yaml
  • Environment registration and rollout parameters:
    • infra/envs/dev/cloud_build.tf
    • infra/envs/prod/cloud_build.tf
    • infra/envs/dev/cloud_deploy.tf
    • infra/envs/prod/cloud_deploy.tf

Cloud Run shape

The service manifest currently configures:

  • Cloud Run IAM authentication
  • h2c on port 8080
  • 1 vCPU / 512Mi
  • containerConcurrency: 250
  • timeoutSeconds: 120

Startup behavior

On startup the managed runtime:

  1. Resolves the active GCP project id.
  2. Creates and warms the Spanner connection pool.
  3. Reads the Statsig server key from Secret Manager.
  4. Initializes the shared Statsig client.
  5. Rebuilds the Connect app with the managed repository and forwarder.

If any of those steps fails, startup fails and the process does not begin serving requests.
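
The fail-fast sequence above can be sketched as follows. This is a minimal illustration, not the service's actual startup code: run_startup, StartupError, and the five step functions are hypothetical names, and each step stores a placeholder value rather than calling GCP, Spanner, Secret Manager, or Statsig.

```python
from typing import Callable, Dict, List, Tuple

Step = Callable[[Dict[str, object]], None]


class StartupError(RuntimeError):
    """Raised when a startup step fails; the process should exit, not serve."""


def run_startup(steps: List[Tuple[str, Step]]) -> Dict[str, object]:
    """Run steps in order with shared state and stop at the first failure."""
    state: Dict[str, object] = {}
    for name, step in steps:
        try:
            step(state)
        except Exception as exc:
            # Later steps never run once one step has failed.
            raise StartupError(f"startup step failed: {name}") from exc
    return state


# Hypothetical stand-ins for the five documented steps (placeholder values only).
def resolve_project_id(state: Dict[str, object]) -> None:
    state["project_id"] = "example-project"


def warm_spanner_pool(state: Dict[str, object]) -> None:
    state["pool"] = f"pool-for-{state['project_id']}"


def read_statsig_key(state: Dict[str, object]) -> None:
    state["statsig_key"] = "placeholder-key"


def init_statsig_client(state: Dict[str, object]) -> None:
    state["statsig"] = ("client", state["statsig_key"])


def rebuild_connect_app(state: Dict[str, object]) -> None:
    state["app"] = ("app", state["pool"], state["statsig"])


STARTUP_STEPS: List[Tuple[str, Step]] = [
    ("resolve_project_id", resolve_project_id),
    ("warm_spanner_pool", warm_spanner_pool),
    ("read_statsig_key", read_statsig_key),
    ("init_statsig_client", init_statsig_client),
    ("rebuild_connect_app", rebuild_connect_app),
]
```

Calling run_startup(STARTUP_STEPS) returns the populated state only if every step succeeds. Any exception surfaces as StartupError before a listener would be opened.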

Internal-only auth posture

  • run.googleapis.com/authentication: "iam" in the Cloud Run manifest is the enforcement point.
  • AuthInterceptor only records caller identity for service-managed audit metadata and logs.
  • Authorized callers still need the Cloud Run roles/run.invoker role and a Google-signed ID token whose audience is the service URL.

When changing auth behavior, update the app code and the deploy/IAM files together. Do not treat AuthInterceptor as the primary access-control layer.
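
To illustrate what "records caller identity" can mean without acting as access control, here is a hedged sketch. The function name caller_identity and the choice of the email and sub claims are assumptions, as is the token arriving in an Authorization: Bearer header. The sketch deliberately does not verify the token signature, because Cloud Run IAM is the enforcement point, so its result must only be used for audit metadata and logs.

```python
import base64
import json
from typing import Optional


def caller_identity(authorization: Optional[str]) -> Optional[str]:
    """Extract a caller label from a bearer ID token for audit logging only.

    No signature check happens here, so the result is NOT an authorization
    decision. Returns None when the header is missing or malformed.
    """
    if not authorization or not authorization.lower().startswith("bearer "):
        return None
    token = authorization[len("bearer "):].strip()
    parts = token.split(".")
    if len(parts) != 3:
        return None
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    try:
        claims = json.loads(base64.urlsafe_b64decode(payload))
    except ValueError:
        # Covers bad base64, bad UTF-8, and bad JSON.
        return None
    if not isinstance(claims, dict):
        return None
    return claims.get("email") or claims.get("sub")
```

A request that reaches this code has already passed the Cloud Run IAM check. A None result should therefore be logged as an unknown caller rather than treated as a rejection.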

Logs and failure signals

The service emits structured log events under the services.events logger. The stable event names are:

  • events.log_event.request_accepted
  • events.log_event.request_rejected
  • events.log_event.write_succeeded
  • events.log_event.write_failed
  • events.log_event.forward_succeeded
  • events.log_event.forward_failed
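
As a rough sketch of how these names could be emitted, the helper below logs one JSON object per event on the services.events logger. The helper name log_event, the JSON payload shape, and the mapping of event names to log levels are assumptions for illustration. Only the logger name and the six event names come from this page.

```python
import json
import logging

LOGGER_NAME = "services.events"

# The six stable event names listed above.
EVENT_NAMES = frozenset({
    "events.log_event.request_accepted",
    "events.log_event.request_rejected",
    "events.log_event.write_succeeded",
    "events.log_event.write_failed",
    "events.log_event.forward_succeeded",
    "events.log_event.forward_failed",
})


def log_event(event: str, **fields: object) -> None:
    """Emit one structured record; unknown event names are a programming error."""
    if event not in EVENT_NAMES:
        raise ValueError(f"unknown event name: {event}")
    # Assumed level mapping: failures are errors, rejections are warnings.
    if event.endswith("_failed"):
        level = logging.ERROR
    elif event.endswith("_rejected"):
        level = logging.WARNING
    else:
        level = logging.INFO
    payload = {"event": event, **fields}
    logging.getLogger(LOGGER_NAME).log(level, json.dumps(payload, sort_keys=True, default=str))
```

Rejecting unknown names keeps the set stable, so alerts and log-based filters can match on the exact event string.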

The service layer defines a metrics interface, but the default runtime uses a no-op implementation until a concrete backend is wired in.
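
The no-op pattern described here might look like the sketch below. The interface shape (a single increment method), the class names, and the record_write helper are hypothetical. The real interface in the service layer may differ.

```python
from typing import Dict, Mapping, Optional, Protocol


class Metrics(Protocol):
    """Hypothetical metrics interface the service layer depends on."""

    def increment(self, name: str, tags: Optional[Mapping[str, str]] = None) -> None:
        ...


class NoOpMetrics:
    """Default implementation: accepts every call and records nothing."""

    def increment(self, name: str, tags: Optional[Mapping[str, str]] = None) -> None:
        return None


class InMemoryMetrics:
    """Simple counter backend, useful in tests until a real backend exists."""

    def __init__(self) -> None:
        self.counts: Dict[str, int] = {}

    def increment(self, name: str, tags: Optional[Mapping[str, str]] = None) -> None:
        self.counts[name] = self.counts.get(name, 0) + 1


def record_write(metrics: Metrics, ok: bool) -> None:
    """Call site stays the same regardless of which backend is injected."""
    metrics.increment("write_succeeded" if ok else "write_failed")
```

Because call sites depend only on the interface, wiring in a concrete backend later should not require changes to handler code.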

Important failure modes

  • Validation or normalization failures return INVALID_ARGUMENT.
  • Spanner write failures return INTERNAL.
  • Statsig forwarding failures after a successful write return UNAVAILABLE, but the row is already stored in Spanner.
  • Because idempotency_key is correlation-only in v1 (it is not used to deduplicate writes), retrying after UNAVAILABLE can create duplicate rows.
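
The failure modes above, including the duplicate-on-retry hazard, can be reproduced in a small model. Everything here is a stand-in: EventService, RpcError, the in-memory rows list in place of Spanner, the injected forward callable in place of Statsig, and the single "name is required" validation rule are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


class RpcError(Exception):
    """Carries a status code string such as INVALID_ARGUMENT."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code


@dataclass
class EventService:
    forward: Callable[[Dict[str, str]], None]  # stand-in for Statsig forwarding
    rows: List[Dict[str, str]] = field(default_factory=list)  # stand-in for Spanner

    def _write(self, event: Dict[str, str]) -> None:
        # idempotency_key is stored with the row but never checked (v1 behavior).
        self.rows.append(dict(event))

    def log_event(self, event: Dict[str, str]) -> None:
        if not event.get("name"):
            raise RpcError("INVALID_ARGUMENT", "name is required")
        try:
            self._write(event)
        except Exception as exc:
            raise RpcError("INTERNAL", "write failed") from exc
        try:
            self.forward(event)
        except Exception as exc:
            # The row is already stored at this point.
            raise RpcError("UNAVAILABLE", "forward failed after write") from exc
```

With a forwarder that fails once and then succeeds, calling log_event twice with the same idempotency_key leaves two rows in rows. Callers that retry on UNAVAILABLE should therefore expect to deduplicate downstream by idempotency_key.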
