Data Quality Incident Root Cause Analysis Builder
Investigate metric discrepancies, broken pipelines, missing events, or dashboard errors with a structured data quality incident RCA and prevention plan.
Prompt Template
You are a data reliability lead. Help investigate a data quality incident and produce a root cause analysis with prevention steps.

**Incident summary:** [what looked wrong]
**Metric/table/dashboard affected:** [name and link/description]
**Business impact:** [wrong report, decision risk, customer-facing error, revenue impact]
**Detection source:** [analyst, stakeholder, alert, data test, customer]
**Time window affected:** [start/end or unknown]
**Systems involved:** [source app, warehouse, ETL/ELT, BI tool, reverse ETL, spreadsheet]
**Recent changes:** [deploys, schema changes, tracking changes, backfills, vendor changes]
**Evidence available:** [queries, logs, screenshots, row counts, test failures]
**Known constraints:** [limited lineage, no owner, missing logs, urgent board report]
**Stakeholders:** [data, product, finance, marketing, executives]

Produce:
1. **Incident summary:** plain-English explanation of what happened and why it matters.
2. **Impact assessment:** affected metrics, audiences, reports, date range, confidence level, and decisions at risk.
3. **Investigation plan:** step-by-step checks from source event to final dashboard.
4. **Root cause hypotheses:** ranked causes with evidence needed to confirm or reject each.
5. **SQL/query checks:** sample validation queries or logic to compare counts, freshness, nulls, duplicates, joins, and filters.
6. **RCA narrative:** timeline, root cause, contributing factors, detection gap, and resolution.
7. **Remediation plan:** data fix, stakeholder communication, dashboard annotations, and backfill validation.
8. **Prevention plan:** tests, ownership, lineage, alert thresholds, release checks, and runbook updates.
9. **Stakeholder update draft:** concise message for non-technical stakeholders.

Separate confirmed facts from assumptions. Do not invent data that was not provided.
Example Output
Data Quality Incident RCA — MRR Dashboard Drop
Summary
The April MRR dashboard showed an 18% drop that was not real. The cause was a schema change in the billing export: `plan_amount` switched from cents to dollars, but the transformation still divided every value by 100, so records exported after the change were understated.
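A unit change like this tends to show up as a roughly 100x jump or drop in typical field values. The following is a minimal sketch of one way to flag it, assuming you can pull `plan_amount` samples from before and after a suspected change date. The function name `detect_unit_shift`, the `factor`, and the `tolerance` are illustrative assumptions, not part of any billing tool.

```python
from statistics import median


def detect_unit_shift(before, after, factor=100, tolerance=0.5):
    """Flag a likely unit change between two samples of the same field.

    Compares medians and returns "scaled_up" if values grew by roughly
    `factor`, "scaled_down" if they shrank by roughly `factor`, else None.
    `factor` and `tolerance` are illustrative defaults (assumptions).
    """
    if not before or not after:
        return None  # nothing to compare
    m_before, m_after = median(before), median(after)
    if m_before == 0 or m_after == 0:
        return None  # ratio is undefined or meaningless
    ratio = m_after / m_before
    low, high = 1 - tolerance, 1 + tolerance
    if factor * low <= ratio <= factor * high:
        return "scaled_up"
    if low / factor <= ratio <= high / factor:
        return "scaled_down"
    return None


# Invented sample values, for illustration only.
old_values = [49.0, 99.0, 199.0]
new_values = [4900.0, 9900.0, 19900.0]
print(detect_unit_shift(old_values, new_values))
```

Medians are used rather than means so that a handful of unusually large plans does not mask or mimic the shift. A real check would likely segment by plan or currency first.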
Impact
| Area | Impact | Confidence |
|---|---|---|
| Executive MRR dashboard | April 14-16 understated MRR | High |
| Finance forecast export | Not affected; uses source billing report | High |
| Marketing cohort report | Affected for paid conversion value | Medium |
Investigation Checks
- Compare raw billing totals by day against transformed `fact_subscription_revenue`.
- Check row freshness and duplicate invoice IDs.
- Inspect schema diff from the billing connector release.
- Validate dashboard filter logic after backfill.
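The first two checks above can be sketched as queries. This example runs them against an in-memory SQLite database so it is self-contained; the `raw_billing` table, the column names (`invoice_id`, `billing_date`, `amount`), and all rows are assumptions made up for illustration, and the SQL may need adjusting for your warehouse dialect.

```python
import sqlite3

# Days where raw and transformed totals disagree by more than a tolerance.
RECONCILE_SQL = """
SELECT r.billing_date,
       r.raw_total,
       COALESCE(f.fact_total, 0) AS fact_total
FROM (SELECT billing_date, SUM(amount) AS raw_total
      FROM raw_billing GROUP BY billing_date) AS r
LEFT JOIN (SELECT billing_date, SUM(amount) AS fact_total
           FROM fact_subscription_revenue GROUP BY billing_date) AS f
  ON f.billing_date = r.billing_date
WHERE ABS(r.raw_total - COALESCE(f.fact_total, 0)) > :tolerance
ORDER BY r.billing_date
"""

# Invoice IDs that appear more than once in the transformed table.
DUPLICATES_SQL = """
SELECT invoice_id, COUNT(*) AS n
FROM fact_subscription_revenue
GROUP BY invoice_id
HAVING COUNT(*) > 1
ORDER BY invoice_id
"""


def mismatched_days(conn, tolerance=0.01):
    return conn.execute(RECONCILE_SQL, {"tolerance": tolerance}).fetchall()


def duplicate_invoices(conn):
    return conn.execute(DUPLICATES_SQL).fetchall()


def build_demo_db():
    """Create a tiny in-memory database with invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE raw_billing
            (invoice_id TEXT, billing_date TEXT, amount REAL);
        CREATE TABLE fact_subscription_revenue
            (invoice_id TEXT, billing_date TEXT, amount REAL);
    """)
    raw = [
        ("inv-1", "2024-04-13", 100.0),
        ("inv-2", "2024-04-14", 200.0),
        ("inv-3", "2024-04-14", 50.0),
    ]
    fact = [
        ("inv-1", "2024-04-13", 100.0),
        ("inv-2", "2024-04-14", 2.0),    # scaled incorrectly
        ("inv-3", "2024-04-14", 50.0),
        ("inv-3", "2024-04-14", 50.0),   # duplicated row
    ]
    conn.executemany("INSERT INTO raw_billing VALUES (?, ?, ?)", raw)
    conn.executemany(
        "INSERT INTO fact_subscription_revenue VALUES (?, ?, ?)", fact)
    return conn


conn = build_demo_db()
print(mismatched_days(conn))
print(duplicate_invoices(conn))
```

Running the reconciliation per day rather than over the whole window helps narrow the affected date range, which feeds directly into the impact table.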
Prevention
1. Add an accepted-range test that fails when MRR moves more than 5% day over day.
2. Add a schema-change alert for billing connector numeric fields.
3. Assign Finance Ops as business owner for revenue metric sign-off.
4. Annotate the dashboard and send a correction note to stakeholders.
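The day-over-day range test in step 1 could look roughly like the sketch below. It assumes daily MRR is available as a date-ordered list of `(day, value)` pairs; the function name `day_over_day_breaches` and the sample figures are hypothetical, and in practice this logic would more likely live in your data testing framework than in a standalone script.

```python
def day_over_day_breaches(daily_mrr, max_change=0.05):
    """Return (day, fractional_change) for each day whose MRR moved more
    than `max_change` relative to the previous day.

    `daily_mrr` must be a list of (day, value) pairs sorted by day.
    """
    breaches = []
    for (_, prev), (day, cur) in zip(daily_mrr, daily_mrr[1:]):
        if prev == 0:
            continue  # relative change is undefined; handle separately
        change = (cur - prev) / prev
        if abs(change) > max_change:
            breaches.append((day, round(change, 4)))
    return breaches


# Invented figures, for illustration only.
series = [
    ("day 1", 100000.0),
    ("day 2", 101000.0),
    ("day 3", 82820.0),
    ("day 4", 83000.0),
]
for day, change in day_over_day_breaches(series):
    print(f"{day}: MRR moved {change:+.1%} versus the previous day")
```

Because the check uses the absolute change, the jump back to normal after a backfill would also trigger it. That is usually acceptable, since a large correction deserves a second look too.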
Tips for Best Results
- 💡 Start with impact and affected decisions so the investigation does not become a purely technical treasure hunt.
- 💡 Compare source, staging, model, and BI layers separately to isolate where the issue entered.
- 💡 Document assumptions explicitly; data incidents get messy fast when guesses become facts.
- 💡 After fixing the data, update dashboards and stakeholders so stale screenshots do not keep circulating.