Incident Postmortem Template Builder
Generate a blameless engineering postmortem with timeline, root cause analysis, and follow-up actions after a production incident.
Prompt Template
You are a senior SRE facilitating a blameless postmortem. Draft a complete incident review based on the details below.

**Incident title:** [short name]
**Date and duration:** [when it happened and how long it lasted]
**Customer impact:** [who was affected and how]
**Severity level:** [SEV1 / SEV2 / SEV3]
**Systems involved:** [services, databases, providers]
**Detection method:** [alert, user report, dashboard, support ticket]
**Timeline notes:** [key events with times]
**Known root cause:** [what failed]
**Contributing factors:** [process, tooling, staffing, architecture]
**Temporary mitigation:** [what stopped the immediate issue]
**Permanent fix ideas:** [candidate fixes]

Output:
1. Executive summary
2. Customer impact statement
3. Minute-by-minute incident timeline
4. Five Whys root cause analysis
5. What went well / what went poorly / where we got lucky
6. Corrective actions table with owner, priority, and due date
7. Prevention recommendations for monitoring, testing, rollout safety, and communication

Tone: factual, blameless, and useful for engineering leadership.
Example Output
Executive Summary
On 14 May, a configuration deploy to the billing worker caused duplicate invoice retries for 38 minutes. 11.2% of payment attempts failed and 64 customers saw delayed receipts. The issue was detected by support before alerts fired.
Five Whys
1. Why did invoices retry indefinitely? The worker ignored the retry cap after a config parse failure.
2. Why did the parse fail? A new timeout field shipped as a string instead of an integer.
3. Why was that not caught? Config validation ran only at app boot, not in background workers.
4. Why was rollout not halted? The deploy pipeline lacked canary checks for worker-only services.
5. Why did support detect it first? No alert existed for retry queue growth by tenant.
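Whys 1 to 3 describe a config value of the wrong type silently disabling the retry cap. A minimal sketch of worker-side schema validation might look like the following, assuming a Python worker. The names `WorkerConfig`, `ConfigError`, `load_worker_config`, `timeout_seconds`, and `max_retries` are hypothetical and not taken from the incident's codebase. The idea it illustrates is failing closed: a malformed value stops the worker at load time instead of letting it run without limits.

```python
from dataclasses import dataclass


class ConfigError(ValueError):
    """Raised when worker config fails validation (hypothetical name)."""


@dataclass(frozen=True)
class WorkerConfig:
    # Field names are illustrative assumptions, not the real schema.
    timeout_seconds: int
    max_retries: int


def load_worker_config(raw: dict) -> WorkerConfig:
    """Validate raw config and refuse to start on any bad field."""
    errors = []
    for field in ("timeout_seconds", "max_retries"):
        value = raw.get(field)
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, int) or isinstance(value, bool):
            errors.append(f"{field} must be an integer, got {value!r}")
        elif value <= 0:
            errors.append(f"{field} must be positive, got {value}")
    if errors:
        # Fail closed: never fall back to "no retry cap".
        raise ConfigError("; ".join(errors))
    return WorkerConfig(raw["timeout_seconds"], raw["max_retries"])
```

With this shape, a string value such as `"30"` for `timeout_seconds` raises `ConfigError` before any job is processed, and the same function can be called from both app boot and worker startup.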
Corrective Actions
| Action | Owner | Priority | Due |
|---|---|---|---|
| Add schema validation for worker config | Priya | P0 | 2026-05-20 |
| Create retry queue anomaly alert | Anton | P1 | 2026-05-18 |
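The retry queue alert in the table could be approximated by comparing per-tenant queue depth between two samples. The sketch below is one possible shape under stated assumptions: the function name, the input format (tenant ID mapped to queue depth), and the thresholds `min_depth` and `growth_factor` are all hypothetical and would need tuning against real traffic.

```python
def tenants_with_retry_growth(
    previous: dict[str, int],
    current: dict[str, int],
    *,
    min_depth: int = 100,        # assumed floor to ignore small queues
    growth_factor: float = 2.0,  # assumed multiplier that counts as anomalous
) -> list[str]:
    """Return tenants whose retry queue is both deep and growing fast."""
    flagged = []
    for tenant, depth in current.items():
        # Treat a missing or empty baseline as 1 to avoid a zero threshold.
        baseline = max(previous.get(tenant, 0), 1)
        if depth >= min_depth and depth >= growth_factor * baseline:
            flagged.append(tenant)
    return sorted(flagged)
```

Evaluating growth per tenant rather than on the global queue matters here because a single tenant's runaway retries can be hidden inside an aggregate that still looks healthy.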
Tips for Best Results
- 💡 Keep the timeline precise; it often reveals detection and escalation gaps better than summaries do
- 💡 Separate root cause from contributing factors so remediation stays targeted
- 💡 Assign owners and due dates immediately, or the postmortem will turn into shelfware
- 💡 Blameless does not mean vague; be explicit about system and process failures