Incident Response Playbook + Drills
Author an incident response playbook for a service. Run 2 simulated incidents against it.
PagerDuty or Opsgenie, Markdown for runbooks, Slack integration
About this project
Most companies have ad-hoc incident response. This project builds the discipline: incident severity levels, on-call rotation, paging policy, runbooks for the top 5 likely failures, postmortem template, and drill cadence. Pair this with the chaos game-day project for a complete SRE portfolio.
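As a rough illustration of how severity levels and a paging policy fit together, here is a minimal Python sketch. Everything in it is an assumption made for the example rather than part of the project brief: the four-level `Severity` scale, the `PagingRule` fields, the Slack channel names, the acknowledgement times, and the thresholds in `classify` are all hypothetical and should be replaced with values your team agrees on.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical severity scale; a lower number means more severe."""
    SEV1 = 1  # full outage or data loss
    SEV2 = 2  # major feature degraded for many users
    SEV3 = 3  # minor degradation, workaround exists
    SEV4 = 4  # no measurable user impact


@dataclass(frozen=True)
class PagingRule:
    page_on_call: bool   # page the on-call engineer immediately?
    ack_minutes: int     # assumed time allowed to acknowledge
    notify_channel: str  # hypothetical Slack channel name


# Assumed policy table: only SEV1 and SEV2 page a human right away.
PAGING_POLICY = {
    Severity.SEV1: PagingRule(True, 5, "#incidents"),
    Severity.SEV2: PagingRule(True, 15, "#incidents"),
    Severity.SEV3: PagingRule(False, 240, "#ops-triage"),
    Severity.SEV4: PagingRule(False, 1440, "#ops-triage"),
}


def classify(users_affected_pct: float, data_loss: bool) -> Severity:
    """Map simple impact signals to a severity (illustrative thresholds)."""
    if data_loss or users_affected_pct >= 50:
        return Severity.SEV1
    if users_affected_pct >= 10:
        return Severity.SEV2
    if users_affected_pct > 0:
        return Severity.SEV3
    return Severity.SEV4


if __name__ == "__main__":
    sev = classify(users_affected_pct=12.0, data_loss=False)
    rule = PAGING_POLICY[sev]
    print(sev.name, rule.page_on_call, rule.notify_channel)
```

Keeping the policy in one table like this makes it easy to review in the playbook and to check during a drill that each simulated incident was paged the way the policy says it should be.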
Why build this in 2026?
Incidents in AI products are messy: LLM calls time out, and models hallucinate under load. A strong incident response (IR) playbook sets a team apart.
What you'll ship
- Playbook document
- 5 runbooks for likely failures
- Postmortem of one simulated incident
Skills you'll practice
incident response, monitoring