Learn AI security by red-teaming it.
Target: a test AI agent. Objective: extract its secrets. Each capture teaches one real attack technique — the same ones landing against production LLM systems right now.
Featured targets
Real attack techniques, wrapped in characters you'll remember.
The WCAP is a hands-on AI pentest credential. Founding-cohort operators earn a free seat by capturing 5 flags.
Learning modules
Concept · walkthrough · practice · quiz · defenses · extensions. ~45 min each.
How LLMs Work (for security)
The base-layer concepts every AI security module builds on: tokens, roles, context, attention, alignment, and tool calls.
Prompt Injection
The foundational attack class. Why the instruction/data boundary doesn't exist in LLMs — and what to do about it.
Indirect Prompt Injection
When the attacker isn't the user. How malicious instructions travel through retrieved documents, emails, web pages, and tool outputs to hijack agents on someone else's behalf — and why this is the production threat model for most LLM apps shipping today.
System Prompt Extraction
How attackers leak the instructions that define your AI agent — and how to stop them.
Tool Abuse
When agents have tools, attackers have primitives. Exploiting the gap between what a tool permits and what it should allow.
Data Exfiltration
How attackers move sensitive content out of LLM applications through tool calls, rendered markdown, cross-tenant retrieval, and side channels — and why the model is the last place the defense should live.
Jailbreaks & Guardrail Bypass
How attackers route around alignment training and application-layer content rules — and why the hardening belongs at the app layer, not the model.
Insecure Output Handling
Why every conventional web vulnerability — SQL injection, XSS, SSRF, RCE — comes back when a downstream system trusts an LLM's output the way it would never trust a user's input.
Vector and Embedding Weaknesses
The attack surface nobody audits: RAG poisoning, cross-tenant retrieval leakage, embedding inversion, and reranker manipulation — why the vector database is a trust boundary, not plumbing.
Unbounded Consumption
When the attack is the bill — LLM-specific resource exhaustion through token floods, generation runaway, tool-call storms, ingestion amplification, and model extraction, and why classical rate limits miss the attack.
Memory Poisoning
Persistent memory features bolt a retrieval layer onto a language model and ship it as a product. The attack surface they create is more dangerous than RAG, more permanent than session injection, and almost completely undefended at the layer that matters.
CTF Challenges
Active operations. Each one is a deployed AI system with a secret to capture, and each lets you practice components of the modules above.
wraith.sh/u/<you>.
The top rank is earned by holding WCAP, the Wraith Certified AI Pentester credential.