HackerOne Large Language Model (LLM) Pentest is a point-in-time security assessment designed to identify technical vulnerabilities in applications that incorporate or rely on large language models. The service provides audit-ready outputs aligned with compliance frameworks and industry methodologies, with coverage tailored to the complexity and risk level of the AI system under test.

This page outlines what is included in an AI and LLM Pentest, the differences between the Add-on and Standalone offerings, and the criteria used during scoping.

When to Choose an LLM Application Pentest

Select this assessment when you need to:

Validate the security of LLM-supported functionality before launching to production
Meet compliance requirements such as EU AI Act, NIST AI RMF, or internal audit mandates
Assess risks introduced by model prompts, data inputs, RAG pipelines, agents, and external integrations
Provide technical assurance for AI native or AI-enhanced products
Demonstrate secure use of third-party model providers or managed AI services.

LLM Application Pentesting Offerings

HackerOne provides two methodologies: Add-on and Standalone. These options reflect the risk level and complexity of the AI system being tested.

LLM Application Add-On Pentest

The Add-on methodology provides focused testing for applications with AI features that supplement, rather than define, the core product. This option adds AI-specific risk coverage to a standard application pentest.

Category	Details
Best for	Chat or summarization features within existing applications Simple classification or recommendation features AI functionality that enhances but does not drive system behavior
Coverage objectives	Prompt injection Sensitive information disclosure Improper output handling System prompt leakage Misinformation Unbounded consumption

LLM Application Standalone Pentest

The Standalone Pentest is a deep dive methodology for AI-native or AI-intensive systems. It evaluates model behavior, agent frameworks, vector and embedding pipelines, system prompts, data ingestion, and the complete AI integration surface. This option expands the core methodology with advanced adversarial analysis and supply chain evaluation.

Category	Details
Best for	Products where AI is central to functionality Systems with multi-agent workflows Models or frameworks that perform planning, tool calling, or autonomous actions Applications with vector stores, embeddings, or large-scale retrieval processes Environments where LLM outputs directly impact decisions, workflow execution, or user experience
Coverage objectives	Standalone coverage includes all add-on objectives plus advanced security areas: Prompt injection Sensitive information disclosure Supply chain vulnerabilities Data and model poisoning Improper output handling Excessive agency System prompt leakage Vector and embedding weaknesses Misinformation Unbounded consumption Agentic AI security threats AI safety considerations

Why Standalone Is the Primary Fit for AI Products

Modern AI systems combine inputs, prompts, data stores, model pathways, and autonomous capabilities, introducing unique and often interconnected risks. Standalone testing evaluates these elements comprehensively, uncovering deep vulnerabilities that may not be revealed by lightweight testing.

Scoping Requirements

To determine the correct methodology and calculate testing hours, HackerOne collects information about the AI system, including:

Number of AI functions or user-facing operations
Use of Retrieval Augmented Generation or vector stores
Number and type of AI agents
Differences in behavior across user roles
Guardrails in place, such as moderation, filtering, or content policies
Model or provider used
Rate limits at the application, gateway, and provider levels
Whether the asset is publicly accessible

Phases of a Standalone LLM Application Pentest

Our LLM Application Pentests follow a structured approach to ensure comprehensive coverage and actionable results.

Getting Started: Environment and access preparation, confirmation of rate limits, and alignment on test boundaries.

Researcher Selection and Vetting: Pentesters are selected based on experience with LLM architectures, system prompt analysis, embeddings, RAG pipelines, and adversarial techniques.
Methodology: Testing draws from the following frameworks:

OWASP Top 10 for LLM Applications
MITRE ATLAS
HackerOne Penest Methodolgy V3.9 https://marketing-assets.hackerone-user-content.com/HackerOne_H1P_Methodology_V3.9.pdf

Standalone assessments include additional tasks for embedding analysis, agent risk, supply chain review, and model poisoning scenarios.

Reporting: The final audit-ready report includes:
- Verified findings with severity ratings
- Clear reproduction steps
- Remediation guidance
- Mapping to compliance frameworks

Retesting: We offer retesting to verify that identified vulnerabilities have been effectively remediated.
Deliverables: Every AI and LLM Pentest includes:
- Detailed assessment report
- Technical findings with evidence
- Severity and impact analysis
- Documentation suitable for audits and compliance reviews
- Executive summary for leadership
- Optional retesting after remediation

How AI and LLM Pentesting Differs from AI Red Teaming

This comparison helps customers validate that they have selected the correct product after understanding the pentest offering.

Aspect	LLM Application Pentest	AI Red Teaming
Goal	Identify technical vulnerabilities for compliance and security assurance.	Identify misuse pathways and unsafe behaviors
Approach	Structured methodology and checklists	Creative scenario-based testing
Focus	Prompt injection, data leakage, poisoning, embeddings, and agent abuse.	Bias, harmful outputs, guideline violations
Team	A focused team of vetted pentesters.	A diverse community of specialized AI researchers.

Expanded LLM Application Pentesting Methodology

AI security has evolved at breakneck speed, and so have the threats. Over the past year, our pentesters have uncovered new classes of vulnerabilities unique to large language models and AI-powered systems. To stay ahead, HackerOne has expanded its AI and LLM testing methodology to meet the next generation of risk head-on.

The new framework introduces 12 core domains, 89 targeted test cases, and a deeper focus on agentic AI, retrieval-augmented generation (RAG) systems, and AI safety: setting a new standard for how enterprises secure intelligent systems.

Key Updates

Expanded attack surface coverage: New checks for vector/embedding weaknesses, agentic AI threats, and AI safety mechanisms
Sophisticated attack techniques: Integration of multi-modal attacks, adversarial methods, and cross-modal exploitation vectors
Supply chain depth: Focus on AI Bills of Materials (AI-BOM) and model provenance analysis
Granular test specifications: Each check now contains 8-15 specific validation points based on common threat models HackerOne has observed in AI systems over the past year

Methodology Evolution

The updated methodology expands prompt injection testing to include multi-modal and cross-modal attacks, recognizing that vulnerabilities now span text, image, and audio interfaces. Our testers now simulate adversarial suffix attacks and token-smuggling methods that exploit LLM tokenization boundaries, mirroring the techniques observed in real-world exploit attempts across enterprise AI deployments.

These enhancements are informed by findings from hundreds of AI system tests run through the HackerOne community. Many of the researchers who contributed to this update also help maintain leading open-source AI security tools, ensuring our methodology reflects the most current, adversarial testing practices in the field.

Advanced Prompt Injection Testing

Prompt injection testing emerged quickly in security for AI domain, with a heavy focus on well-documented attack patterns, such as jailbreaking, virtualization, and role-playing. Since then, attackers have developed significantly more sophisticated techniques.

New Additions

Multi-modal injection vectors: Testing now includes image, audio, and video-based prompt injections, reflecting the widespread adoption of multi-modal models like GPT-4V and Gemini.
Adversarial suffix attacks: Incorporation of research on gradient-based adversarial prompts that can reliably bypass safety mechanisms
Cross-modal attack vectors: Recognition that attackers can leverage one modality to compromise another (e.g., using images to inject text-based prompts)
Token smuggling techniques: Advanced obfuscation methods that exploit tokenization boundaries

Enhanced Data Protection Testing

Early security testing of AI systems and LLM-powered functionality acknowledged sensitive information disclosure through prompt crafting; today, modern attacks go far beyond simple data extraction.

New Additions

Model inversion and membership inference attacks: These privacy attacks can reconstruct training data or determine if specific data was used in training
Multi-tenant data isolation testing: Essential for SaaS deployments where multiple organizations share infrastructure
Differential privacy bypass attempts: Testing whether mathematical privacy guarantees can be circumvented

AI Supply Chain Security

The evolution from basic dependency checking to comprehensive AI supply chain analysis reflects the unique risks in the AI ecosystem.

New Additions

AI Bills of Materials (AI-BOM) Evaluation: Evaluate tracking models, datasets, and training procedures provided through installation chains and dependency manifests
Model registry access controls: Recognition that model artifacts require specialized security controls
Backdoor detection in pre-trained models: Addressing research showing that popular pre-trained models can contain hidden malicious behaviors

Vector and Embedding Security

The explosive growth of Retrieval-Augmented Generation (RAG) systems has created an entirely new attack surface that didn't exist when the original methodology was developed.

Major vector database providers, such as Pinecone and Weaviate, have begun implementing security features specifically to address these concerns, validating the importance of this new testing category.

New Additions

RAG/knowledge-base poisoning: Malicious embeddings that can poison semantic search results
Cross-tenant context leakage: Privacy violations in shared vector stores
RAG system exploitation: Manipulating retrieval mechanisms to control model outputs

Agentic AI Security

The rise of autonomous AI agents in 2025 has represented a paradigm shift from passive Q&A systems to active decision-makers as enterprises have begun to realize scale efficiencies by inserting agentic AI into operations.

New Additions

MCP (Model Context Protocol) security: Testing implementations for security flaws or insecure configurations in agentic workflows.
Agent goal manipulation: Ensuring agents cannot be redirected from their intended objectives
Cascading failure induction: Preventing single compromised agents from affecting entire networks
AI-powered social engineering: Detection of agents’ susceptibility to manipulate humans or other agents

Granular Output Handling Vulnerabilities

LLM02: Insecure Output Handling was updated to LLM05:2025 Improper Output Handling in the OWASP Top 10 for LLM Applications 2025. HackerOne’s penetration testing methodology breaks down the high-level concept to a detailed taxonomy of 12 specific injection types. This granularity reflects observed attacks where LLMs have been used as vectors for traditional web vulnerabilities, requiring testers to understand both AI and conventional security domains.

New Additions

Markdown image injection
LDAP injection via LLM
Template injection attacks
Deserialization vulnerabilities

AI Safety and Alignment Testing

This category addresses the growing concern around AI safety and responsible deployment. This addition reflects the industry's shift toward responsible AI practices and the emergence of global regulatory requirements.

New Additions

Safety policy jailbreak probes: Testing robustness of safety training
Harm prevention controls: Validating the effectiveness of content filters
Bias detection and mitigation: Ensuring fair and ethical model behavior

What This Means for Security Teams

The updated methodology represents a fundamental shift in how we approach AI/LLM security testing:

Tool Ecosystem Maturity: The development of specialized AI security testing frameworks is reflected in the breadth of testing tooling now available and actively maintained within the open-source community. Many maintainers and creators of these tools are members of the HackerOne Pentester Community and have worked closely with our core team to understand the toolkits attackers are building to compromise AI systems.
Regulatory Alignment: Testing now directly maps to the EU AI Act and U.S. AI safety mandates, simplifying compliance readiness.
Proactive Threat Modeling: Our approach anticipates emerging attack vectors based on AI capability trajectories, so your defenses mature before attackers do.

Pentest Phases and Terminology

HackerOne LLM Application Pentest

When to Choose an LLM Application Pentest

LLM Application Pentesting Offerings

LLM Application Add-On Pentest

LLM Application Standalone Pentest

Why Standalone Is the Primary Fit for AI Products

Scoping Requirements

Phases of a Standalone LLM Application Pentest

How AI and LLM Pentesting Differs from AI Red Teaming

Expanded LLM Application Pentesting Methodology

Key Updates

Methodology Evolution

Advanced Prompt Injection Testing

New Additions

Enhanced Data Protection Testing

New Additions

AI Supply Chain Security

New Additions

Vector and Embedding Security

New Additions

Agentic AI Security

New Additions

Granular Output Handling Vulnerabilities

New Additions

AI Safety and Alignment Testing

New Additions

What This Means for Security Teams