Skip to main content

AI Security Bounty Engagement Best Practices

Organizations: Quick tips to optimize your programs for AI

Updated this week

AI is becoming an essential part of every business, and as such, it requires thorough security testing. Because these systems behave differently from traditional web apps, it’s critical to equip researchers with clear scope, robust testing environments, attractive incentives, and resources. The following sections outline the pillars you need in place to launch a successful AI-focused engagement and will help guide conversations with your Customer Success Manager or Technical Account Manager.

Defining a Clear, Focused Scope

Start by cataloging every AI component you intend to test—chat interfaces, API endpoints, backend orchestration layers, file-upload processors, and any downstream services the AI can reach. Explicitly list the AI-specific vulnerabilities you would like to receive reports on that are relevant to your specific business context (e.g., “the AI assistant should not be able to provide information about other users or make account changes for the current user”) and call out any additional “safety” concerns you want addressed (bias exploits, harmful content bypasses, or legal/regulatory risks). As a starting point, we recommend consulting the OWASP Top 10 for LLMS.

Equally important is stating what you will not reward (harmless hallucinations, irrelevant outputs, or bias without a security impact). Defining in-scope versus out-of-scope areas up front prevents confusion and focuses researcher effort where it matters most.

Preparing a Supportive Testing Environment

Researchers need an environment that is both safe and as close to production as possible. Ideally, customers should set up a sandbox or staging instance populated with dummy data and capable of being reset. Where a full sandbox isn’t feasible, ensure production safeguards (rate limits, content filters) are documented and adjustable for testers. Streamline access by pre-creating test accounts or API keys, and adjust rate limits so hackers can probe deeply without being blocked. Finally, freeze non-critical updates to the AI system during the bounty window (or communicate any emergency patches promptly) to avoid chasing moving targets.

Attractive, Aligned Incentives

Given the novelty and complexity of AI exploits, reward structures must reflect both effort and impact. Set competitive bounty ranges for AI findings—mapping prompt injections that leak private data to high or critical payouts—and define clear severity criteria with concrete examples.

Assuming the AI asset in scope has some security controls and has had some baseline level of testing, we recommend a minimum bounty level of the following:

  • $5,000 for Critical reports

  • $2,000 for High reports

  • $750 for Medium reports

  • $250 for Low reports

If an AI asset has already been thoroughly tested and/or requires an onerous testing setup, then a competitive bounty table would look like:

  • $10,000 for Critical reports

  • $5,000 for High reports

  • $2,000 for Medium reports

  • $500 for Low reports

However, an AI asset that is significantly hardened may warrant a higher rewards tier.

You can also boost engagement with AI-specific bonuses (e.g., manually tracking first-finder awards for specific vulnerability types). Non-monetary recognition—blog features, hall-of-fame shout-outs, or speaking opportunities—also motivates the community. Above all, transparency around how you’ll grade and reward AI vulnerabilities builds trust and drives participation.

Example Severity Criteria:

Severity level

Example vulnerability categories

Bounty payment

Critical

  • Insecure Plugin Design

  • Insecure Output Handling

  • High-Impact Prompt Injection

Examples:

RAG Data Poisoning: An Attacker can inject or overwrite retrieval sources so that every user sees malicious or misleading content.

Unauthorized Account Takeover: Attacker uses the assistant to change another user’s password or permissions without proper authentication.

$7000

High

  • Supply Chain Vulnerabilities

  • Broad Sensitive Information Disclosure / Inferred Sensitive Data

Examples:

Prompt Injection with Limited Impact: Attacker crafts a prompt that causes the assistant to execute administrative commands (e.g. modify user settings) on behalf of another user.

Context Leakage: Assistant reveals hidden system prompts or sensitive request headers.

$5000

Medium

  • Sensitive Information Disclosure / Inferred Sensitive Data about another user

  • Excessive Agency

Examples:

System Information Disclosure: Finding a way to call internal APIs that should be restricted, but no actual data or settings change occurs without additional steps.

RAG Retrieval Bypass: Attacker triggers retrieval of non‐public documents without altering them

$1500

Low

  • Low Severity Context Leakage

Examples:

  • Prompt leaks that lead to non-sensitive internal information about the model, such as its original conditioning prompt(s).

  • Sensitive Information Disclosure / Inferred Sensitive Data only about the current user.

$500

Comprehensive Documentation and Tooling

High-quality documentation is the key to a successful AI Security bug bounty program. Provide an architecture overview of the AI pipeline, sample code, or Postman collections demonstrating how to query the model as an attachment on the Security page. Summarize existing safety controls and data-handling flows so researchers know the baseline or known issues. Share any internal or community-built tools for prompt fuzzing or adversarial input generation. By lowering the learning curve, you will attract a broader pool of researchers.

Tip: Hai can generate architecture diagrams for you!

Active Communication and Educational Support

AI security is still emerging; researchers will appreciate real-time guidance. Establish a dedicated email monitored by both security and development teams, and consider a short kickoff recorded demo that you attach to the Security page or office hours to walk through the AI’s intended behavior and known boundaries. Quickly publishing clarifications or an FAQ (in either the Security page or an external document) in response to questions keeps the momentum high.

Did this answer your question?