← Incident Database
Jailbreak / Guardrail BypassMedium
Policy Puppetry universal LLM jailbreak
April 2025 · Cross-model (all major LLMs)
What happened
A single transferable prompt template disguises adversarial requests as structured "policy" files (XML/JSON/INI). Models interpret the formatted content as internal developer policy and comply, bypassing safety alignment and sometimes leaking the system prompt.
Root cause
Systemic over-trust of policy and instruction-like structured input learned during training, which makes the bypass hard to patch model-side.
Fix / outcome
No single vendor patch. Treated as a systemic alignment weakness; mitigation favors external guardrails and input inspection.
Sources
Learn this attack class
This incident is an example of Jailbreak / Guardrail Bypass. Read the guide, then try it hands-on in the Academy.