How Duplicate Detection Works
Incoming reports are analyzed against your program’s history to find potential matches. Here's what happens:
1. Finding Candidates
The system searches your report history using three complementary methods:
Keyword matching — Searches titles and descriptions for similar terms
Semantic similarity — Understands meaning, not just exact wording, so reports using different terminology to describe the same issue can still be matched
Technical fingerprinting — Compares specific indicators like affected endpoints, vulnerable parameters, and exploitation methods
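The three retrieval methods can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the `Report` fields, the Jaccard overlap threshold, and the helper names are all assumptions, and a real system would use a search index plus embedding similarity for the semantic step rather than plain token overlap.

```python
from dataclasses import dataclass

@dataclass
class Report:
    title: str
    description: str
    endpoint: str = ""
    parameter: str = ""

def keyword_overlap(a: Report, b: Report) -> float:
    """Jaccard similarity over title+description tokens (stand-in for keyword search)."""
    ta = set((a.title + " " + a.description).lower().split())
    tb = set((b.title + " " + b.description).lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def fingerprint_match(a: Report, b: Report) -> bool:
    """Technical fingerprint: same affected endpoint and parameter."""
    return bool(a.endpoint) and (a.endpoint, a.parameter) == (b.endpoint, b.parameter)

def find_candidates(new: Report, history: list[Report],
                    threshold: float = 0.3) -> list[Report]:
    # A historical report becomes a candidate if any method flags it.
    # A production system would also run a semantic (embedding) search here,
    # so reports phrased with different terminology can still match.
    return [r for r in history
            if keyword_overlap(new, r) >= threshold or fingerprint_match(new, r)]
```

Running the methods in parallel and unioning their hits keeps recall high: a report that uses unusual wording can still surface through its technical fingerprint.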
2. Comparing Reports
For each potential match, a detailed comparison is performed:
Same vulnerability type? — Checks whether both reports describe the same category of issue (e.g., both are SQL injection, both are XSS)
Same target? — Compares the specific endpoint, parameter, or component affected
Same exploitation method? — Examines whether the attack vector and reproduction steps are identical
Same root cause? — Determines whether both reports stem from the same underlying flaw
The system weighs these factors together. Two reports might share a vulnerability type but affect different endpoints—that's not a duplicate. Or they might use different terminology but describe the exact same issue—that is a duplicate.
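The weighting logic can be sketched as a simple scoring function. The weights, the threshold, and the hard requirement on target are illustrative assumptions, not the real model:

```python
def duplicate_score(same_type: bool, same_target: bool,
                    same_method: bool, same_root_cause: bool) -> float:
    # Illustrative weights: target and root cause dominate, because sharing
    # a vulnerability type alone never makes two reports duplicates.
    return (0.15 * same_type + 0.35 * same_target
            + 0.20 * same_method + 0.30 * same_root_cause)

def looks_like_duplicate(same_type: bool, same_target: bool, same_method: bool,
                         same_root_cause: bool, threshold: float = 0.75) -> bool:
    # Matching the target is treated as a hard requirement here: the same
    # vulnerability type on a different endpoint is a separate instance.
    return same_target and duplicate_score(
        same_type, same_target, same_method, same_root_cause) >= threshold
```

Under this sketch, two SQL injection reports on different endpoints fail the target check outright, while two reports that differ only in terminology still clear the threshold.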
3. Validating the Match
Before a recommendation is surfaced, several checks are applied:
The potential original must have been submitted first
Self-closed reports without review are not used as originals
Resolved issues that reappear may be regressions, not duplicates (see below)
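The three checks can be sketched as a small validation step. The field names and the 90-day regression window are assumptions for illustration only:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass
class HistoricalReport:
    submitted_at: datetime
    self_closed: bool = False
    reviewed: bool = False
    resolved_at: Optional[datetime] = None

def validate_original(original: HistoricalReport, new_submitted_at: datetime,
                      regression_window: timedelta = timedelta(days=90)) -> str:
    """Return 'ok', 'rejected', or 'possible_regression' for a candidate original."""
    if original.submitted_at >= new_submitted_at:
        return "rejected"             # the original must have been submitted first
    if original.self_closed and not original.reviewed:
        return "rejected"             # self-closed without review can't anchor a match
    if original.resolved_at and new_submitted_at - original.resolved_at > regression_window:
        return "possible_regression"  # the fix may have regressed; investigate instead
    return "ok"
```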
4. Surfacing a Recommendation
The recommendation includes:
The proposed original report
Key similarities that support the match
Guidance on whether you can act immediately or should investigate further
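One way to picture the recommendation payload is as a small record with those three parts. The field names here are purely illustrative, not the product's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class DuplicateRecommendation:
    original_report_id: str                                 # the proposed original report
    similarities: list[str] = field(default_factory=list)   # evidence supporting the match
    action: str = "investigate"                             # or "close_now" at high confidence

rec = DuplicateRecommendation(
    original_report_id="1234",
    similarities=["same endpoint /api/users", "same parameter", "same injection technique"],
    action="close_now",
)
```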
Closing as Duplicate
When you accept a duplicate recommendation, the report is closed as Duplicate with a reference to the original report. The reporter will see that their report was closed as a duplicate and can view the original it was linked to.
If you disagree with the recommendation, you can override it and continue triaging normally.
Duplicates vs. Regressions
Agentic validation distinguishes between true duplicates and potential regressions:
Duplicate: A new report describing the same vulnerability that's already been reported and is still open or was recently addressed.
Regression: A vulnerability that was previously fixed but has reappeared. If the original report was resolved some time ago, the new submission is flagged for investigation rather than recommended as a straightforward duplicate, because the issue may have returned after being fixed.
When a potential regression is detected, you'll see guidance to investigate whether:
The original fix was incomplete
A code change reintroduced the vulnerability
This is genuinely a new instance that should be tracked separately
What Makes Two Reports Duplicates?
The same vulnerability instance:
Identical vulnerability on the exact same target/endpoint
Same root cause and exploitation method
Reports explicitly reference the same issue
Different instances (not duplicates):
Same vulnerability type but different endpoints (e.g., SQL injection on /api/users vs /api/orders)
Same systematic issue affecting different components
Similar technique but different vulnerable elements
Example: Two SQL injection reports on your website are NOT duplicates if they affect different API endpoints—even if the root cause (inadequate input validation) is the same. They represent two separate vulnerability instances that each need to be addressed.
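The rule in this example can be made concrete. This sketch assumes each report carries its vulnerability type and affected endpoint; both parameters are hypothetical:

```python
def same_instance(type_a: str, endpoint_a: str,
                  type_b: str, endpoint_b: str) -> bool:
    # Two reports can only be duplicates when they share both the
    # vulnerability class AND the target. The same class on a different
    # endpoint is a separate instance that needs its own fix.
    return type_a == type_b and endpoint_a == endpoint_b
```

So `same_instance("sqli", "/api/users", "sqli", "/api/orders")` is false even though both reports describe SQL injection, while two SQL injection reports against `/api/users` pass the check and move on to the deeper comparison of method and root cause.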
