Intermediate, ~16 hours

Incident Response Playbook + Drills

Author an incident response playbook for a service, then run two simulated incidents against it.

PagerDuty or Opsgenie, Markdown for runbooks, Slack integration

About this project

Most companies have ad-hoc incident response. This project builds the discipline: incident severity levels, on-call rotation, paging policy, runbooks for the top 5 likely failures, postmortem template, and drill cadence. Pair this with the chaos game-day project for a complete SRE portfolio.
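
To make the severity levels and paging policy concrete, here is a minimal Python sketch of how a playbook might encode them. Everything in it is an illustrative assumption, not part of the project brief: the three severity tiers, the error-rate and user-impact thresholds, the acknowledgement deadlines, and the Slack channel names are all hypothetical placeholders to be replaced with values agreed for your own service.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Severity(Enum):
    """Hypothetical three-tier scheme; real playbooks often use 3 to 5 tiers."""
    SEV1 = 1  # assumed meaning: outage affecting most users
    SEV2 = 2  # assumed meaning: major degradation
    SEV3 = 3  # assumed meaning: minor issue, handled in working hours


@dataclass(frozen=True)
class PagingRule:
    page_on_call: bool           # whether to wake up the on-call engineer
    ack_minutes: Optional[int]   # acknowledgement deadline; None means no deadline
    channel: str                 # hypothetical Slack channel for coordination


# Hypothetical paging policy: one rule per severity level.
PAGING_POLICY = {
    Severity.SEV1: PagingRule(page_on_call=True, ack_minutes=5, channel="#incident-sev1"),
    Severity.SEV2: PagingRule(page_on_call=True, ack_minutes=15, channel="#incident-sev2"),
    Severity.SEV3: PagingRule(page_on_call=False, ack_minutes=None, channel="#ops-triage"),
}


def classify(error_rate: float, users_affected_pct: float) -> Severity:
    """Map two impact signals to a severity level using assumed thresholds.

    error_rate is a fraction between 0 and 1; users_affected_pct is 0 to 100.
    """
    if error_rate >= 0.50 or users_affected_pct >= 50:
        return Severity.SEV1
    if error_rate >= 0.05 or users_affected_pct >= 5:
        return Severity.SEV2
    return Severity.SEV3


def route(error_rate: float, users_affected_pct: float) -> tuple:
    """Return the severity and the paging rule that applies to it."""
    severity = classify(error_rate, users_affected_pct)
    return severity, PAGING_POLICY[severity]


if __name__ == "__main__":
    severity, rule = route(error_rate=0.08, users_affected_pct=2.0)
    print(severity.name, rule)
```

Keeping the policy in a single table like this makes it easy to review in a drill: each simulated incident can be checked against the rule it should have triggered.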

Why build this in 2026?

Incidents in AI products are messy: LLM timeouts and hallucinations under load are typical examples. A strong incident response (IR) playbook differentiates a team.

What you'll ship

  • Playbook document
  • 5 runbooks for likely failures
  • Postmortem of one simulated incident

Skills you'll practice

incident response, monitoring