⚙️
Ops
Operations
Pipeline health, data freshness SLAs, infra cost, and on-call incident timelines for your data org.
⚠️ This dashboard uses synthetic data to preserve client confidentiality. Layout, metrics, and interactions match real production deployments.
| Metric | Value | Change |
|---|---|---|
| DAGs healthy | 212/214 | ▲ 4 |
| SLA hit | 99.2% | ▲ 1.4 pp |
| Pages (30d) | 21 | ▼ 78% |
| MTTR | 4m 12s | ▼ 32m |
Trend, last 30 days
Apr 22: auto-retry classifier promoted to prod
Chart panels: Mix breakdown, Cohort breakdown, Score distribution, Health gauge, Outlier scan
Top failing DAGs (30d)
| DAG | Fails | Self-heal | Owner |
|---|---|---|---|
| events_etl | 14 | 12 | data-platform |
| stripe_sync | 11 | 11 | fin-eng |
| mixpanel_load | 9 | 8 | growth-eng |
| ml_features_v2 | 7 | 4 | ml-platform |
| churn_score | 6 | 6 | ml-platform |
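
The per-DAG self-heal share implied by the table above can be worked out with a few lines of Python. This is a minimal sketch, not part of the dashboard: the row values are copied from the table, the names `ROWS`, `self_heal_rate`, and `summarize` are hypothetical, and it assumes the "Self-heal" column counts failures that recovered without a page.

```python
# Rows copied from the "Top failing DAGs (30d)" table:
# (dag, fails, self_healed, owner)
ROWS = [
    ("events_etl", 14, 12, "data-platform"),
    ("stripe_sync", 11, 11, "fin-eng"),
    ("mixpanel_load", 9, 8, "growth-eng"),
    ("ml_features_v2", 7, 4, "ml-platform"),
    ("churn_score", 6, 6, "ml-platform"),
]


def self_heal_rate(fails, self_healed):
    """Fraction of failures that self-healed; 0.0 if there were no failures.

    Assumption: "Self-heal" means recovered without paging anyone.
    """
    return self_healed / fails if fails else 0.0


def summarize(rows):
    """Return per-DAG rates plus totals across the listed rows only."""
    per_dag = {dag: self_heal_rate(f, s) for dag, f, s, _owner in rows}
    total_fails = sum(f for _dag, f, _s, _owner in rows)
    total_healed = sum(s for _dag, _f, s, _owner in rows)
    overall = self_heal_rate(total_fails, total_healed)
    return per_dag, total_fails, total_healed, overall


if __name__ == "__main__":
    per_dag, total_fails, total_healed, overall = summarize(ROWS)
    for dag, rate in sorted(per_dag.items(), key=lambda kv: kv[1]):
        print(f"{dag:16s} {rate:.0%}")
    print(f"listed DAGs: {total_healed}/{total_fails} self-healed ({overall:.0%})")
```

Across the five listed DAGs, 41 of 47 failures self-healed (about 87%), and ml_features_v2 has the lowest share at 4 of 7. The table covers only the top failing DAGs, so this figure need not match the dashboard-wide percentage.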
AI insight
📌 What this dashboard reveals
Auto-retry is handling 85% of failures without paging. The three DAGs with the most failures (events_etl, stripe_sync, mixpanel_load) are queued for refactor.
— Reporter Agent · operations · synthetic data
Charts use illustrative mock data. Real client deployments are confidential.