Hai Security & Trust

All Audiences: Data security and confidentiality with Hai

Overview

At HackerOne, security and transparency guide every stage of our technology development. Hai, the generative AI (GenAI) system embedded within the HackerOne platform, is built in close collaboration with customers, security researchers, and industry experts to meet the highest standards of safety, trust, and security.

Hai operates as a coordinated team of AI agents that transform findings and complex data into clear, actionable guidance. It accelerates decision-making, strengthens communication with researchers, and adapts to organizational processes and guidelines.

Hai relies on large language models (LLMs) to distill high volumes of unstructured information. To generate outputs, Hai considers proprietary insights, organization-specific context, and user-level permissions. This ensures responses are tailored to each user while protecting all customer and researcher data within the platform.

Hai Data Governance Principles

  • Hai does not train, fine-tune, or otherwise improve GenAI or large language models on customer or researcher data.

  • Authorization rules govern all RAG, tool use, and agent operations.

  • User conversations remain isolated and are not shared across accounts.

  • All inference occurs within HackerOne's secure environment using subprocessors approved for AI processing activities. Each LLM supporting Hai operates statelessly and does not retain interaction data.

  • Human approval is required before Hai performs actions.

  • Hai is fully in scope for HackerOne’s bug bounty program, including authorization boundaries and cross-user or cross-organization data access. Security researchers are encouraged to test and validate Hai’s integrity.

The following sections provide further details on these commitments. Each area offers deeper insight into how Hai safeguards data, maintains rigorous operational controls, and upholds HackerOne’s security obligations and trust standards.

Data Security and Confidentiality

Hai is designed with strong security and confidentiality protections. Vulnerability reports contain sensitive information, and HackerOne ensures they remain under customer and researcher control.

Managing data privacy within trained or fine-tuned models introduces significant complexity: enforcing granular permission sets inside a model while preventing unintended data exposure is difficult, and getting it wrong creates serious security risks. Hai takes a different approach to mitigate these risks: Hai does not train, fine-tune, or otherwise improve GenAI or large language models with customer or researcher data.

Hai’s GenAI models are stateless, meaning conversation data does not alter the model. All inferences occur entirely within HackerOne's infrastructure, including calling the model to generate a response. This ensures that HackerOne controls how conversational data is used and maintained securely.

Human-in-the-Loop Oversight

Hai defaults to explicit human-in-the-loop oversight. It requests approval before making updates, sending communications, or adjusting report attributes. Whenever Hai proposes an action—such as sending an email or changing a severity—it provides a clear, actionable prompt for approval.

Agentic features follow the same principle. Each action is logged, and users can review Hai’s behavior through detailed audit trails.
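The approval principle above can be sketched as a pending action that executes only after explicit human sign-off, with every decision recorded for the audit trail. This is an illustrative Python sketch under invented names (the `PendingAction` class and its methods are assumptions, not HackerOne's actual API):

```python
from dataclasses import dataclass, field

@dataclass
class PendingAction:
    """An action Hai proposes but does not execute until a human approves it."""
    description: str          # e.g. "Change severity of #1234 to High"
    approved: bool = False
    audit_log: list = field(default_factory=list)

    def approve(self, user: str) -> None:
        """A human reviewer explicitly signs off on the proposed action."""
        self.approved = True
        self.audit_log.append(f"approved by {user}")

    def execute(self) -> str:
        if not self.approved:
            # Default-deny: nothing runs without explicit approval.
            self.audit_log.append("execution blocked: awaiting approval")
            return "blocked"
        self.audit_log.append("executed")
        return "executed"

action = PendingAction("Change severity of #1234 to High")
print(action.execute())                    # blocked until a human approves
action.approve("triager@example.com")
print(action.execute())
```

The design choice being illustrated is default-deny: the action object itself refuses to run until approval is recorded, and both the block and the approval land in the audit log.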

Continuous Security

HackerOne subjects Hai to ongoing testing through its bug bounty program. Security researchers evaluate authorization boundaries, cross-tenant protections, and agent behaviors. This complements HackerOne’s internal testing and reinforces the security of Hai’s design.

Hai Architecture Overview

Based on the above principles and considerations, this section provides a technical overview of how Hai operates to achieve accuracy and generate insights using foundation models.

Hai architecture diagram

Hai operates by enriching user prompts with relevant context before calling an LLM. The context enrichment relies on two key capabilities:

Retrieval-Augmented Generation (RAG)

  • Hai retrieves relevant data through the HackerOne authorization middleware.

  • Retrieved content is incorporated into the context passed to the LLM.

  • Vector embeddings help surface relevant platform documentation and publicly available HackerOne materials when needed.

Tooling (Function Calling)

  • When prompts require dynamic or structured data (e.g., filtering reports by date or status), Hai uses tools within the platform.

  • Tool calls follow the same authentication, validation, and authorization rules as any other feature.

  • Tool outputs are fed back into the LLM to refine and complete the user’s answer.

The combination of RAG and tool-based querying enables accurate, context-aware, and permission-aware responses. This same RAG flow applies to autonomous agents operating within the platform.

Autonomous Agents

  • Agents operate in response to platform events (e.g., new reports).

  • They run as designated users and can retrieve only the data they are authorized to access.

  • Outputs are visible only to users who have been cleared to view the related resource.
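The agent model above can be sketched as an event subscription where the agent runs as a designated user and its output is scoped to that user's organization. All names here (`on_event`, `emit`, the `OUTPUTS` store) are invented for illustration:

```python
from typing import Callable

SUBSCRIBERS: list[Callable[[dict], None]] = []
OUTPUTS: list[tuple[str, str]] = []   # (visible_to_org, message)

def on_event(handler: Callable[[dict], None]) -> Callable[[dict], None]:
    """Register an agent to run in response to platform events."""
    SUBSCRIBERS.append(handler)
    return handler

def emit(event: dict) -> None:
    for handler in SUBSCRIBERS:
        handler(event)

@on_event
def triage_agent(event: dict) -> None:
    # The agent runs as a designated user, so its data access and its
    # output visibility are both scoped to that user's permissions.
    agent_user = {"org": event["org"]}
    OUTPUTS.append((agent_user["org"], f"Triage note for report {event['report_id']}"))

emit({"type": "new_report", "report_id": 1234, "org": "acme"})
print(OUTPUTS)
```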

Use Case 1: Summarizing a Vulnerability Report

Customers often use Hai to get a summary of a vulnerability report submitted to their Bug Bounty or Vulnerability Disclosure Program to synthesize or restructure the report in a preferred format. Here's how this request is handled internally:

  1. User Request
    The user asks Hai to summarize a specific report, including a report ID (for example, “Summarize #1234 for me, please”).

  2. Contextual Recognition
    Hai recognizes the report ID and includes its data as context to generate a relevant summary.

  3. Data Retrieval
    Hai fetches the report details on the user’s behalf. All requests go through established authorization boundaries, so Hai only accesses data the user can see.

  4. LLM Processing
    Hai sends the user’s question and any retrieved context to an approved LLM provider through a secure and controlled integration. The model generates a concise summary, and Hai returns the result to the user.

  5. Clean up
    Once the LLM processes the information and generates a summary, it resets to its original state. The interaction isn’t stored in the model—only the summary and a detailed audit log are saved in the HackerOne platform.

Use Case 2: Generating Report Insights

Hai can also help with more complex questions, such as understanding how similar issues were handled in the past. This historical context helps customers stay consistent over time and saves effort by removing the need to manually search through old reports.

This type of request involves a few more steps than a simple summary. Hai guides the LLM through multiple small queries and combines the results to surface trends or past decisions from your data.

Here's how that typically works:

  1. Starting with the Report
    As with summarization, Hai begins by pulling up the report based on your request (like, “How did we handle #1234 before?”).

  2. Realizing More Info Is Needed
    To give a helpful answer, the LLM figures out it needs more context, such as how your team has handled similar issues in the past. Instead of guessing, Hai uses one of its tools to get that information.

  3. Looking Up Past Reports
    Hai uses the “report insights” tool to find relevant historical data. This request is checked to make sure it only returns information the user is allowed to access.

  4. Using That Info to Improve the Answer
    The LLM adds the new data to its understanding of the situation and uses it to create a more thoughtful, useful response.

  5. Looping Back if Needed
    If the LLM still needs more detail, Hai can ask for more data, either by adjusting the previous request or using a different tool. Access checks are enforced each time to protect data privacy. Hai keeps refining the answer until it’s accurate and useful.

RAG and tool-based workflows ensure precision and protect authorization boundaries.

Frequently Asked Questions (FAQ)

What is Hai?

Hai is a generative AI system that coordinates multiple specialized AI agents within the HackerOne platform, transforming findings into validated, actionable guidance. Hai operates under strict security and data governance controls. It relies only on data the user is authorized to access and helps accelerate remediation and decision-making.

Who can use Hai?

Hai is enabled by default for all customers and researchers.

Can I disable Hai?

Administrators can disable Hai for their organization. Doing so blocks users from accessing Hai Chat, but does not disable AI-driven capabilities used by HackerOne to support its Services.

Does HackerOne still use Hai for its Services if I disable it?

Yes. Hai is a core part of HackerOne’s services (including Hai Triage) and will continue to support those workflows even if you disable Hai for your users.

HackerOne integrates Hai throughout its service offerings as an integral part of its workflow to improve productivity and streamline triage. The system analyzes report data to confirm scope alignment, filters potential spam submissions, and evaluates reports against vulnerability criteria and program guidelines.

Can I enable or disable specific Hai features?

Controls operate at the organization level, allowing admins to enable or disable Hai for the entire organization. Granular controls are not supported for specific features or for a specific subset of users.

Does Hai share data with third parties?

No. Hai does not share customer or researcher data with external parties for the purpose of training, fine-tuning, or otherwise improving GenAI or large language models. Customer and researcher data are used only for inference at response generation time.

Hai uses pre-trained LLMs from trusted and vetted providers. When Hai sends data to an LLM for inference, it does so through secure, governed integrations that ensure the model does not store or learn from your conversational data. These providers process the data only to generate the requested output and do not retain it.

Does Hai use my data for GenAI training purposes?

As outlined above, Hai is a GenAI tool: it uses your data only at inference time, passing it to a pre-trained LLM to generate an answer. However, Hai does not train, fine-tune, or otherwise improve GenAI or large language models with customer or researcher data.

How does Hai keep my data secure?

Hai ensures that your data remains under your control and is not shared outside HackerOne without your consent. HackerOne is ISO 27001, SOC 2, and FedRAMP certified, and GDPR compliant. Hai adheres to all existing security and compliance protocols applicable across the HackerOne platform. This includes strict authentication and authorization controls, governed integrations with vetted model providers, and safeguards that prevent customer and researcher data from being used to train or fine-tune third-party models. All inference requests are handled in a way that ensures data is processed securely, not retained by model providers, and not exposed across users or organizations.

Hai is subject to all of our existing high-level security and compliance protocols, which include:

  1. Role-based access controls (RBAC)

  2. System hardening

  3. Regular patching and maintenance

  4. Robust logging

  5. At least annual AI Red Teaming & Penetration Testing

But don't just take our word for it. We invite you and any third-party researchers to validate our controls by using Hai in our bug bounty program.

Does Hai provide explainable outputs and audit logs?

Yes. Hai maintains a record of all interactions through its conversation history feature. Questions and answers are stored on the HackerOne platform, allowing users with appropriate permissions to access historical conversations. This serves as an audit trail, capturing inputs and the corresponding outputs generated.

When generating responses to a user’s prompt, Hai uses only data from the program or programs that the user has permission to access. Hai adds context to help users understand the basis for its responses and ensures traceability for how conclusions are reached.

How is data stored and retained?

Data is stored and retained in accordance with HackerOne’s standard data retention policies, privacy policies, and terms. This data is safeguarded through our established security measures. Interactions are only stored on the platform to allow users with the proper permissions to access and view historical conversations related to their program.

How does HackerOne limit unintended GenAI outcomes (toxicity, hallucinations, bias)?

Model providers such as Anthropic and Amazon manage change and release processes to identify, reduce, and mitigate toxic outputs, hallucinations, and biases. The model cards for Amazon Titan, Anthropic Claude 4, Anthropic Haiku 4.5, and Anthropic Claude 4.5 provide more information on these risks.

How does HackerOne train staff on AI regulatory requirements?

HackerOne employees receive training on AI best practices and regulatory obligations through our annual privacy training program. Additionally, in compliance with the EU AI Act, HackerOne meets AI Literacy training requirements for employees.

What is HackerOne's approach to AI risk tolerance and impact measurement?

HackerOne defines reasonable risk tolerances for AI systems, which are informed by laws, regulations, best practices, and industry standards. HackerOne also establishes guidelines to define mechanisms for measuring or understanding an AI system's potential impacts, e.g., via regular impact assessments at key stages in the AI lifecycle connected to system impacts and the frequency of system updates.

How does HackerOne govern the use of AI tools by community members?

Security researchers have quickly built, adapted, and improved the usage of emerging AI-based technologies across the platform. We call these AI-powered hacking tools “hackbots.”

Community members are creative, innovative, and independent service providers who may use assistive tools, like hackbots, when participating in a customer’s program. Given the rise in the use of hackbots, HackerOne has updated its guidelines for community members to reflect this innovation:

  • All hackbots must operate within the boundaries of the published vulnerability disclosure guidelines of the programs they interact with and must comply with HackerOne's Code of Conduct and Disclosure Guidelines.

  • AI tools are not allowed to operate fully autonomously. Our 'hacker-in-the-loop' model requires human experts to investigate, validate, and confirm all potential vulnerabilities before submitting them to any vulnerability disclosure or bug bounty program.

  • Hackbot operators are fully responsible for their AI tools and must exercise due diligence to ensure compliance with platform rules and program guidelines.

  • Human operators using hackbots qualify for applicable rewards, just as if vulnerabilities were discovered through traditional means.

What is the Report Assistant Agent, and how does it support researchers?

The Report Assistant Agent is an on-platform AI capability that helps community members improve clarity, completeness, and reproducibility when writing vulnerability reports. It works only with content provided by the researcher, does not submit reports autonomously, and preserves the researcher’s full responsibility and ownership of each submission. All outputs remain subject to platform rules, program guidelines, and human-in-the-loop requirements.

How does HackerOne ensure the responsible use of the Hai and researcher-operated AI tools?

HackerOne encourages careful human oversight and adherence to established disclosure practices. These guardrails support responsible innovation while protecting customers, researchers, and the broader ecosystem, and HackerOne continues to collaborate with the community to refine these practices as AI-accelerated hacking evolves.

How does HackerOne ensure AI providers meet your security requirements?

HackerOne applies the same rigorous security controls regardless of which approved AI provider processes a request. Every provider interaction flows through our authorization middleware, encryption layer, and audit logging. The engine may differ, but the security envelope does not.

HackerOne maintains full transparency about AI providers through our published subprocessors list. Before any new provider can process customer data, they are formally added to this list. Customers subscribed to subprocessor notifications receive a 30-day notice of any additions, giving you visibility and time to review changes. This is the same process we use for all data processors—AI providers receive no special treatment or exemptions.

What if HackerOne changes its position on training LLMs with confidential customer data?

We understand and respect the significance of the confidential information you entrust to us. We do not currently train LLMs using confidential customer data. However, this is a fast-moving environment. If we ever consider doing so, we would only do so with customer permission.

How does HackerOne use AI beyond GenAI?

HackerOne leverages AI technologies beyond GenAI, including machine learning (ML) models and automation tools. These technologies enable systems to automatically learn and identify patterns, and to provide advanced capabilities such as predictive analytics, automation, personalization, and anomaly detection. For many years, HackerOne has used ML models and automation tools to analyze data, identify patterns, and improve accuracy in tasks such as vulnerability classification.
