AI Agent Security Guides

In-depth, practical guides to the attack classes and defenses shaping the AI agent security landscape. Written from the perspective of a red team, not a product marketer.

Reference

AI Bug Bounty Programs 2026: xAI/Grok, OpenAI, Anthropic + Payouts

Every active AI bug bounty program in 2026: xAI/Grok, OpenAI ($100K max), Anthropic, Google, Microsoft, Cohere, Mozilla 0din, Gray Swan Arena. Verified scopes, payout ranges, and where to submit.

8 min read·22 sections

Career Guide

AI Pentesting Certification: How to Become an AI Pentester in 2026

A practical roadmap to AI pentesting: what the job actually is, the skills and attack classes you need, the free training path that mirrors a real exam, and how an AI pentesting certification proves you can break a production LLM application.

9 min read·13 sections

Career

AI Security Interview Questions (with Answers), 2026

The questions AI security, AI red team, and LLM AppSec interviews actually ask, grouped by topic, each with a concise model answer. Covers fundamentals, prompt injection, agent and tool security, defenses, and scenario questions.

5 min read·6 sections

Attack Guide

Answer-Engine Poisoning: Indirect Prompt Injection Against AI Search

Answer-engine poisoning is indirect prompt injection aimed at the retrieval layer of public AI search (Google AI Overviews, ChatGPT search, Perplexity). An attacker publishes web content engineered to be cited, so the AI relays their misinformation, scam details, or instructions to everyone who asks. This guide covers how the attack works, real cases, and the defenses.

7 min read·6 sections

Attack Guide

Data Exfiltration via Markdown Images: The Quiet AI Vulnerability

Markdown image rendering is the most underrated data exfiltration channel in AI products. This guide gives a working model of how it leaks system prompts, conversation history, and tool output, plus the four defensive patterns that actually close the channel.

15 min read·13 sections

Attack Guide

Data Poisoning in LLMs (OWASP LLM04): How Training Attacks Work and How to Prevent Them

Data and model poisoning is OWASP LLM04: an attacker corrupts a model during training, fine-tuning, or distribution so it carries a hidden backdoor. A 2025 Anthropic study showed that as few as 250 poisoned documents can backdoor a model, regardless of size across the models it tested. This guide explains the attack classes, the real incidents, and the defenses that actually hold.

10 min read·14 sections

Career

How to Become an AI Red Teamer (2026 Roadmap)

A practical roadmap to becoming an AI red teamer or AI security engineer: what the job actually is, the skills and attack classes to learn, the tools, how to practice hands-on, and how to prove it to employers. No PhD required, but real skill is.

5 min read·10 sections

Methodology

How to Find Your First LLM Bug Bounty

A practical guide to finding your first payable vulnerability in an AI-powered application. Covers which programs accept LLM findings, what to look for, how to demonstrate impact, and the common mistakes that get reports closed.

7 min read·12 sections

Attack Guide

How to Pentest a Custom GPT, Claude Project, or Gemini Gem

A hands-on methodology for security testing a custom GPT, Claude Project, or Gemini Gem: how to extract the builder's hidden instructions, pull uploaded knowledge files, abuse connected tools, and bypass guardrails, plus the defenses that actually hold.

12 min read·13 sections

Attack Guide

Indirect Prompt Injection: The Attack That Doesn't Need the Keyboard

A complete guide to indirect prompt injection in 2026: the attack where the adversary never types a word to the AI. How it works, the five injection channels in production systems, real-world incidents, and the architectural defenses that actually hold.

15 min read·20 sections

Attack Guide

Insecure Output Handling in LLMs (OWASP LLM05): Examples and Prevention

Insecure output handling is OWASP LLM05: the failure that happens when downstream code trusts an LLM's output the way it would never trust user input. Worked examples of SQL injection, XSS, SSRF, and command injection via LLM, plus the four-layer prevention stack.

7 min read·10 sections

Attack Guide

LLM Denial of Service and Unbounded Consumption (OWASP LLM10)

When the attack is the bill. A complete guide to OWASP LLM10, unbounded consumption and denial of service against LLM apps: token floods, generation runaway, tool-call storms, denial-of-wallet, reflected amplification, and the rate-limit, quota, and circuit-breaker defenses that actually work.

5 min read·11 sections

Attack Guide

LLM Jailbreaks and Guardrail Bypass: The 2026 Field Guide

A complete reference on LLM jailbreaks and guardrail bypass: the taxonomy of techniques (roleplay, crescendo, many-shot, encoding, refusal suppression, fake-policy injection), why each one works, why the obvious defenses fail, and what layered defense actually looks like in production.

11 min read·18 sections

Attack Guide

LLM Supply Chain Security: Poisoned Models, Malicious Packages, and MCP (OWASP LLM03)

The AI supply chain is everything flowing into your system that you did not write: models, datasets, packages, and MCP servers. A complete guide to OWASP LLM03, poisoned and backdoored models, malicious model-registry uploads, hallucinated and typosquatted packages, compromised AI libraries, and the defenses that hold.

6 min read·11 sections

Attack Guide

MCP Security: The Attack Surface of Model Context Protocol

Model Context Protocol lets AI agents plug into external tools and data, and it concentrates prompt injection, excessive agency, and supply-chain risk into one connector. A complete guide to the MCP attack surface: tool poisoning, rug pulls, tool shadowing, injection via tool output, toxic agent flows, malicious servers, client RCE, and token theft, plus the defenses that hold.

10 min read·19 sections

Attack Guide

Memory Poisoning: How 'Remember This' Becomes the Side Door

Memory features in AI agents bolt a retrieval layer onto a language model and ship it as a product. The attack surface they create is more dangerous than RAG, more permanent than session injection, and almost completely undefended at the layer that matters.

16 min read·22 sections

Attack Guide

Prompt Injection: A Complete Guide for 2026

Everything you need to understand prompt injection as an AI developer or security engineer: the attack classes, why they work, why traditional defenses fail, and how to actually test for them.

10 min read·24 sections

Methodology

Red-Teaming Agentic AI: A Practitioner's Checklist

A structured methodology for security-testing AI agents with tools, memory, and multi-step reasoning. Covers the five phases of an agent red-team engagement, specific attack techniques per phase, and the artifacts you should deliver.

8 min read·10 sections

Defense Guide

Securing RAG Systems: A Practical Guide

Retrieval-Augmented Generation is the most common architecture for production AI applications. It's also one of the easiest to poison. This guide covers the five attack surfaces unique to RAG, with concrete defensive patterns for each.

8 min read·16 sections

Attack Guide

System Prompt Extraction: Techniques and Defenses

A complete reference on system prompt extraction attacks: direct, indirect, and side-channel techniques, why the obvious defenses fail, and the four-layer defense stack that actually works in production.

13 min read·17 sections

Defense Guide

The AI Agent Threat Model: A Practitioner's Guide

How to build a threat model for AI agents with tools, memory, and multi-step reasoning. Covers trust boundaries, data flows, attack surfaces, and the five questions every agent threat model must answer.

7 min read·16 sections

Reference

The OWASP Top 10 for LLM Applications, Annotated (2026 Edition)

A practitioner's walk through every item in the OWASP Top 10 for LLM Applications: what each one actually means, how attackers exploit it in the wild, why the standard mitigations fall short, and what to do instead.

16 min read·12 sections

Reference

The State of LLM Bug Bounties in 2026

A practitioner's guide to where LLM bug bounties actually pay in 2026: program-by-program scope comparison, typical payouts, which classes of AI bugs get rewarded versus closed as 'known limitation,' and how to pick a scope that fits how you hunt.

12 min read·15 sections

Attack Guide

Tool Abuse in AI Agents: The Next SQL Injection

When AI agents have tools, prompt injection becomes catastrophic. This guide covers the taxonomy of tool abuse attacks, real-world exploitation patterns, and defensive architectures that actually constrain what an agent can do.

10 min read·27 sections