AI Prompt Injection Defense System
Build a defense layer against prompt injection: classifier, content filtering, sandboxed execution.
Python · Anthropic SDK · Lakera or open-source classifiers · Docker
About this project
Prompt injection is the underdefended attack class of 2026. This project builds three layers of defense: an input classifier that detects injection attempts, an output filter that catches data exfiltration, and sandboxed tool execution that limits the blast radius of a compromised call. Test the stack against published prompt-injection benchmarks, then document what your defense catches and what slips through.
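The first two layers above can be prototyped quickly. Below is a minimal sketch, assuming a simple regex-based baseline rather than an ML classifier: `classify_input` and `filter_output` are hypothetical names, and the pattern lists are illustrative, not exhaustive. A real system would layer an ML classifier (e.g. a fine-tuned model or a hosted service like Lakera) on top of heuristics like these.

```python
import re

# Heuristic input classifier: flags common injection phrasings.
# Illustrative baseline only -- production systems need an ML classifier.
INJECTION_PATTERNS = [
    r"ignore (all )?(previous|prior|above) instructions",
    r"you are now",
    r"reveal (your|the) system prompt",
    r"disregard (the )?(rules|guidelines)",
]

def classify_input(text: str) -> bool:
    """Return True if the input looks like a prompt-injection attempt."""
    lowered = text.lower()
    return any(re.search(p, lowered) for p in INJECTION_PATTERNS)

# Output filter: redact exfiltration-shaped substrings (e.g. API-key-like
# strings or private-key headers) before output leaves the system.
SECRET_PATTERNS = [
    r"sk-[A-Za-z0-9]{20,}",               # API-key-shaped strings
    r"BEGIN (RSA|OPENSSH) PRIVATE KEY",   # private-key material
]

def filter_output(text: str) -> str:
    """Replace secret-shaped substrings with a redaction marker."""
    for p in SECRET_PATTERNS:
        text = re.sub(p, "[REDACTED]", text)
    return text
```

A useful exercise is measuring the false-positive rate of the heuristic layer on benign inputs before tuning the patterns; the benchmarks mentioned above give you the true-positive side.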
Why build this in 2026?
Prompt injection is the most-asked-about AI security topic in 2026, yet very few teams have shipped working defenses.
What you'll ship
- GitHub repo
- Benchmark results
- Writeup of trade-offs
Skills you'll practice
security · large language models · Python