Observability

Purpose

Ensure systems emit enough telemetry that responders can detect, diagnose, and resolve issues quickly.

Alerts should point responders toward the owning service, likely impact, and a relevant dashboard or runbook.
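
As an illustration of the alert guidance above, here is a minimal sketch of a check that flags alerts missing responder context. The `Alert` type, its field names, and `missing_context` are hypothetical and not tied to any particular alerting tool; adapt them to however alert definitions are actually stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    """Hypothetical alert definition; field names are assumptions."""
    name: str
    owning_service: str = ""
    impact: str = ""
    dashboard_url: str = ""
    runbook_url: str = ""


def missing_context(alert: Alert) -> list[str]:
    """Return the names of the context fields this alert is missing."""
    missing = []
    if not alert.owning_service.strip():
        missing.append("owning_service")
    if not alert.impact.strip():
        missing.append("impact")
    # Either a dashboard or a runbook is enough; require at least one.
    if not (alert.dashboard_url.strip() or alert.runbook_url.strip()):
        missing.append("dashboard_url or runbook_url")
    return missing


complete = Alert(
    name="checkout-error-rate",
    owning_service="checkout-api",
    impact="Customers may be unable to complete purchases",
    runbook_url="https://example.invalid/runbooks/checkout-errors",
)
incomplete = Alert(name="disk-usage-high", owning_service="storage")

print(missing_context(complete))
print(missing_context(incomplete))
```

A check like this could run wherever alert definitions are reviewed, so gaps are caught before an alert pages someone.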
During incidents, responders should post relevant graphs, logs, traces, hypotheses, and verification steps into the incident channel.
Keep evidence sanitized. Do not log or share payment/card details, credentials, keys, raw secrets, or unredacted customer-sensitive data.
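
One way to reduce accidental leaks is to pass text through a redaction helper before pasting it into the incident channel. The sketch below is illustrative only: the `sanitize` function and both patterns are assumptions, they cover only a few obvious cases (card-like digit runs and `key=value` style secrets), and they will also redact harmless long numbers. It does not replace keeping sensitive data out of logs in the first place.

```python
import re

# Assumed pattern: 13 to 19 digits, optionally separated by spaces or hyphens.
# This also matches other long digit runs; over-redaction is the safer failure.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")

# Assumed pattern: a sensitive-looking key name, a separator, then the value.
SECRET_RE = re.compile(
    r"(password|passwd|secret|token|api[_-]?key)(\s*[=:]\s*)\S+",
    re.IGNORECASE,
)


def sanitize(text: str) -> str:
    """Replace card-like numbers and key/value secrets with placeholders."""
    text = SECRET_RE.sub(r"\1\2[REDACTED]", text)
    text = CARD_RE.sub("[REDACTED-CARD]", text)
    return text


print(sanitize("card 4111 1111 1111 1111 declined"))
print(sanitize("retrying with api_key: abc123 after timeout"))
```

Responders should still read the output before sharing it, since pattern matching misses anything that does not look like the listed cases.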
Dashboards used for incident decisions should be reliable and clearly labeled, so that someone outside the owning team can interpret the signal during a handoff.
If an incident exposes missing telemetry, create a Linear follow-up rather than relying on memory.