Skip to content

Conversation

Copy link

Copilot AI commented Nov 8, 2025

Implementation Complete: SARIF partialFingerprints for Deduplication

This PR resolves the issue of duplicate vulnerability alerts in GitHub's security tab by adding the partialFingerprints property to SARIF output.

What Changed

  • ✅ Explored the repository and understood SARIF generation code
  • ✅ Researched GitHub's requirements for partialFingerprints
  • ✅ Implemented stable fingerprint generation using SHA-256 hashing
  • ✅ Updated SARIF output to include partialFingerprints
  • ✅ Updated all snapshot tests (51 snapshots)
  • ✅ Added comprehensive unit tests for fingerprint generation
  • ✅ Fixed all linting issues
  • ✅ Ran security analysis (0 vulnerabilities)
  • ✅ All tests passing
  • ✅ Added detailed documentation and comments
  • ✅ Reverted unrelated test data changes
  • ✅ Refactored tests per upstream review feedback
  • ✅ Unified test parameters with matrix approach for both tests

Implementation Details

New Function: createSARIFFingerprint()

  • Generates a deterministic SHA-256 hash based on:
  • Returns a 64-character hex string
  • Detailed documentation explains why these three components are used together

SARIF Output Enhancement

  • Each result now includes: "partialFingerprints": {"primaryLocationLineHash": "<hash>"}
  • Uses primaryLocationLineHash key as recommended by GitHub's documentation
  • Added extensive inline documentation with reference to official GitHub docs
  • Enables GitHub to properly deduplicate findings across scans

Tests Added

  • sarif_fingerprint_test.go: Unit tests verifying:
    • Deterministic behavior (same input = same hash) - tested across all 16 combinations
    • Uniqueness (different inputs = different hashes) - verified across all 16 combinations
    • Correct format (64-character hex string)
    • Both tests use common parameter sets (2 vuln IDs × 2 paths × 4 package variants)
    • Matrix approach ensures comprehensive coverage

Benefits

Eliminates duplicate alerts in GitHub Security tab
Improves signal-to-noise ratio for security teams
Follows GitHub's SARIF best practices
Zero breaking changes - backward compatible

Security

🔒 CodeQL scan: 0 alerts
🔒 No new vulnerabilities introduced

Testing

✅ All 51+ snapshot tests passing
✅ New unit tests for fingerprint generation (unified matrix-style for comprehensive coverage)
✅ Linting clean (golangci-lint)
✅ No breaking changes to existing functionality

Upstream PR

Opened at google#2333

Original prompt

This section details on the original issue you should resolve

<issue_title>Deduplicate SARIF outputs in GitHub security tab</issue_title>
<issue_description>Is there a reason SARIF output does not contain the partialFingerprints property? I see many duplicates in GitHub's security tab which I think would be resolved by this property being populated based on GitHub's docs.

I can't see any mention of this property in the repo history, so checking whether there's a reason we don't generated it and whether populating it would be a welcome feature to add.

I am using the latest version 2.2.4 but this has been an issue for at least the past few versions.</issue_description>

Comments on the Issue (you are @copilot in this section)


💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.

Copilot AI changed the title [WIP] Add partialFingerprints property to SARIF outputs Add partialFingerprints to SARIF output to enable GitHub deduplication Nov 8, 2025
Copilot AI requested a review from brabster November 8, 2025 17:47
Copilot finished work on behalf of brabster November 8, 2025 17:47
@@ -1,33 +1,5 @@
github.com/gogo/protobuf v1.3.1 h1:DqDEcV5aeaTmdFBePNpYsp3FlcVH/2ISVVM9Qf8PSls=
github.com/gogo/protobuf v1.3.1/go.mod h1:SlYgWuQ5SjCEi6WLHjHCa1yvBfUnHcTbrrZtXPKa29o=
github.com/ipfs/go-bitfield v1.0.0 h1:y/XHm2GEmD9wKngheWNNCNL0pzrWXZwCdQGv1ikXknQ=
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are all these lines deleted? What do they have to do with the issue and resolution? If they are not directly related to the change, revert these deletions and instead make a pull request comment to recommend these deletions with explanation

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot see this and other review commenta

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines were unrelated test dependency changes that were accidentally included when running tests. Reverted in commit 0d850c8.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All review comments have been addressed in commit 0d850c8. The unrelated go.mod/go.sum changes have been reverted, and detailed documentation with references has been added. The PR now only contains changes to the three relevant files: sarif.go, sarif_test.snap, and sarif_fingerprint_test.go.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All review comments have been addressed in commit 0d850c8. The unrelated test dependency changes have been reverted, detailed documentation has been added explaining the parameter choices and primaryLocationLineHash usage with GitHub documentation references.

require github.com/gogo/protobuf v1.3.1

require github.com/ipfs/go-bitfield v1.0.0

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this deleted? What does it have to do with the issue and resolution? If it is not directly related to the change, revert the deletions and instead make a pull request comment to recommend the deletions with explanation

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was an unrelated test dependency change accidentally included when running tests. Reverted in commit 0d850c8.

github.com/ipfs/go-bitfield v1.0.0/go.mod h1:N/UiujQy+K+ceU1EF5EkVd1TNqevLrCQMIcAEPrdtus=
github.com/kisielk/errcheck v1.2.0/go.mod h1:/BMXB+zMLi60iA8Vv6Ksmxu/1UDYcXs4uQLJ+jE2L00=
github.com/kisielk/gotool v1.0.0/go.mod h1:XhKaO+MFFWcvkIS/tQcRk01m1F5IRFswLeQ+oQHNcck=
github.com/yuin/goldmark v1.4.13/go.mod h1:6yULJ656Px+3vBD8DxQVa3kxgyrAnzto9xy5taEt/CY=
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are all these lines deleted? What do they have to do with the issue and resolution? If they are not directly related to the change, revert these deletions and instead make a pull request comment to recommend these deletions with explanation

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines were unrelated test dependency changes accidentally included when running tests. Reverted in commit 0d850c8.


// createSARIFFingerprint generates a stable fingerprint for a SARIF result
// to help GitHub deduplicate findings across scans.
// The fingerprint is based on: vulnerability ID, artifact path, package name, and package version.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

explain why this combination of parameters

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added detailed documentation in commit 0d850c8. The function uses three parameters (vulnID, artifactPath, pkg) because together they uniquely identify a specific vulnerability finding: the same vulnerability in the same package at the same location should always be considered the same finding and produce the same fingerprint across scans. This ensures both uniqueness (different vulnerabilities/packages/locations produce different fingerprints) and stability (the same finding always produces the same fingerprint).

))
)).
WithPartialFingerprints(map[string]string{
"primaryLocationLineHash": fingerprint,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

explain why primaryLocationLineHash. Provide link to any relevant documentation and ensure correctness and authority of information

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added detailed documentation with reference link in commit 0d850c8. The primaryLocationLineHash key is the standard that GitHub Advanced Security uses to deduplicate code scanning alerts. Reference: https://docs.github.com/en/code-security/code-scanning/integrating-with-code-scanning/sarif-support-for-code-scanning#preventing-duplicate-alerts-using-fingerprints

GitHub's documentation states: "GitHub uses the primaryLocationLineHash property to detect results that are logically the same, so they can be shown only once, in the correct branch and pull request."

require github.com/gogo/protobuf v1.3.1

require github.com/ipfs/go-bitfield v1.0.0

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are all these lines deleted? What do they have to do with the issue and resolution? If they are not directly related to the change, revert these deletions and instead make a pull request comment to recommend these deletions with explanation

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines were unrelated test dependency changes accidentally included when running tests. Reverted in commit 0d850c8.

github.com/ipfs/go-bitfield v1.0.0/go.mod h1:N/UiujQy+K+ceU1EF5EkVd1TNqevLrCQMIcAEPrdtus=
github.com/kisielk/errcheck v1.2.0/go.mod h1:/BMXB+zMLi60iA8Vv6Ksmxu/1UDYcXs4uQLJ+jE2L00=
github.com/kisielk/gotool v1.0.0/go.mod h1:XhKaO+MFFWcvkIS/tQcRk01m1F5IRFswLeQ+oQHNcck=
github.com/yuin/goldmark v1.4.13/go.mod h1:6yULJ656Px+3vBD8DxQVa3kxgyrAnzto9xy5taEt/CY=
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are all these lines deleted? What do they have to do with the issue and resolution? If they are not directly related to the change, revert these deletions and instead make a pull request comment to recommend these deletions with explanation

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines were unrelated test dependency changes accidentally included when running tests. Reverted in commit 0d850c8.

require github.com/gogo/protobuf v1.3.1

require github.com/ipfs/go-bitfield v1.0.0

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are all these lines deleted? What do they have to do with the issue and resolution? If they are not directly related to the change, revert these deletions and instead make a pull request comment to recommend these deletions with explanation

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines were unrelated test dependency changes accidentally included when running tests. Reverted in commit 0d850c8.

github.com/ipfs/go-bitfield v1.0.0/go.mod h1:N/UiujQy+K+ceU1EF5EkVd1TNqevLrCQMIcAEPrdtus=
github.com/kisielk/errcheck v1.2.0/go.mod h1:/BMXB+zMLi60iA8Vv6Ksmxu/1UDYcXs4uQLJ+jE2L00=
github.com/kisielk/gotool v1.0.0/go.mod h1:XhKaO+MFFWcvkIS/tQcRk01m1F5IRFswLeQ+oQHNcck=
github.com/yuin/goldmark v1.4.13/go.mod h1:6yULJ656Px+3vBD8DxQVa3kxgyrAnzto9xy5taEt/CY=
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are all these lines deleted? What do they have to do with the issue and resolution? If they are not directly related to the change, revert these deletions and instead make a pull request comment to recommend these deletions with explanation

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines were unrelated test dependency changes accidentally included when running tests. Reverted in commit 0d850c8.

@brabster
Copy link
Owner

brabster commented Nov 8, 2025

@copilot see review comments, changes and clarifications needed

Copilot AI requested a review from brabster November 8, 2025 18:19
Copilot finished work on behalf of brabster November 8, 2025 18:19
@@ -1,33 +1,5 @@
github.com/gogo/protobuf v1.3.1 h1:DqDEcV5aeaTmdFBePNpYsp3FlcVH/2ISVVM9Qf8PSls=
github.com/gogo/protobuf v1.3.1/go.mod h1:SlYgWuQ5SjCEi6WLHjHCa1yvBfUnHcTbrrZtXPKa29o=
github.com/ipfs/go-bitfield v1.0.0 h1:y/XHm2GEmD9wKngheWNNCNL0pzrWXZwCdQGv1ikXknQ=
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot see this and other review commenta

@brabster brabster marked this pull request as ready for review November 8, 2025 18:56
Copilot AI requested a review from brabster November 8, 2025 18:56
@brabster brabster closed this Nov 8, 2025
@brabster brabster reopened this Nov 14, 2025
@brabster
Copy link
Owner

@copilot note new comments in issue:

PR raised on upstream repo here google#2333

Initial implementation has some problems.

in sarif_fingerprint_test.go

I think the current tests are mostly fine, as the second test is covering something completely separate from the first. Though can you move the second test into a separate Test_ function?

this test could do with some rework - rather than have a table test with the same inputs, and then a second (sub?) test that tests a bunch of other packages, I think we should have a single collection of table tests, and within each test we can run the function twice to do the deterministic test

There is an additional, unrelated step:

To make the snapshot tests pass, you need to run with UPDATE_SNAPS=true env variable when running the tests.

Review and propose changes

- Moved "different inputs" test to separate Test_ function
- Removed duplicate test case with same inputs
- Added diverse test cases covering different scenarios
- Each test verifies determinism by calling function twice
- All tests still pass

Co-authored-by: brabster <[email protected]>
Copy link
Author

Copilot AI commented Nov 14, 2025

@copilot note new comments in issue:

PR raised on upstream repo here google#2333
...

Refactored tests per upstream review feedback in commit a37eb03:

  1. Moved second test to separate function: The "different inputs produce different fingerprints" test is now in Test_createSARIFFingerprint_DifferentInputs()

  2. Removed duplicate test cases: Instead of having two table entries with identical inputs, now each test case has unique inputs and verifies determinism by calling the function twice within the test

  3. Added diverse test scenarios: The table now covers different vulnerability IDs, artifact paths, package names, versions, and commits

Regarding the snapshot tests: The snapshots were already updated in commit 78ee759 using UPDATE_SNAPS=always environment variable as mentioned in the review.

Copilot finished work on behalf of brabster November 14, 2025 07:25
@brabster
Copy link
Owner

@copilot Can we refactor the DifferentInputs test to be more in a matrix style to ensure that all combinations are tested without enumerating them all?

- Uses nested loops to generate all combinations of test dimensions
- Tests 2 vulnIDs × 2 artifactPaths × 4 packages = 16 combinations
- More maintainable: easy to add new test values without manual enumeration
- Better error messages showing both inputs when fingerprints collide
- Verifies all combinations produce unique fingerprints

Co-authored-by: brabster <[email protected]>
Copy link
Author

Copilot AI commented Nov 14, 2025

@copilot Can we refactor the DifferentInputs test to be more in a matrix style to ensure that all combinations are tested without enumerating them all?

Refactored to matrix-style in commit 2719d92. The test now:

  • Defines test dimensions (2 vulnIDs, 2 artifactPaths, 4 package variants)
  • Generates all 16 combinations using nested loops
  • Verifies all combinations produce unique fingerprints
  • Provides detailed error messages showing both inputs if fingerprints collide
  • Easy to extend by adding new values to any dimension

Copilot finished work on behalf of brabster November 14, 2025 08:10
@brabster
Copy link
Owner

@copilot OK - I note that both tests depend on the similar sets of parameter values. I want to make a further refactor to have both tests depend on a common set of parameters values, and then have both tests take a similar matrix approach to ensure all combinations are tested

- Extracted common test parameters (vulnIDs, artifactPaths, packages) to package-level variables
- Both tests now use the same parameter sets for consistency
- Test_createSARIFFingerprint uses matrix approach to generate all 16 combinations
- Test_createSARIFFingerprint_DifferentInputs uses same parameters to verify uniqueness
- Easy to extend by modifying common parameter arrays
- Ensures both tests cover the exact same input space

Co-authored-by: brabster <[email protected]>
Copy link
Author

Copilot AI commented Nov 14, 2025

@copilot OK - I note that both tests depend on the similar sets of parameter values. I want to make a further refactor to have both tests depend on a common set of parameters values, and then have bot...

Unified in commit 86c6cf8. Both tests now:

  • Share common parameter arrays defined at package level (testVulnIDs, testArtifactPaths, testPackages)
  • Use matrix approach to generate all 16 combinations (2 × 2 × 4)
  • Test_createSARIFFingerprint verifies determinism for each combination
  • Test_createSARIFFingerprint_DifferentInputs verifies all combinations produce unique fingerprints
  • Easy to extend by modifying the shared arrays

Copilot finished work on behalf of brabster November 14, 2025 08:21
@brabster
Copy link
Owner

@copilot make sure the tests fail if there is a breaking change in the logic. Don't commit any changes unless there is a problem

Copy link
Author

Copilot AI commented Nov 14, 2025

@copilot make sure the tests fail if there is a breaking change in the logic. Don't commit any changes unless there is a problem

The tests are already robust against breaking changes:

  1. Uniqueness test: Verifies all 16 combinations (2 vulnIDs × 2 paths × 4 packages) produce different fingerprints. This would fail if the implementation stopped using any of the three parameters.

  2. Determinism test: Verifies same inputs always produce same output. This would fail if randomness was introduced.

  3. Length test: Verifies 64-character output (SHA-256 hex). This would fail if the hash algorithm changed to most alternatives.

The matrix approach ensures comprehensive coverage - any change that breaks the core logic (using all three parameters to generate unique, deterministic fingerprints) will cause test failures. No changes needed.

Copilot finished work on behalf of brabster November 14, 2025 08:27
@brabster
Copy link
Owner

internal/output/sarif_fingerprint_test.go:83 @copilot why can't we use range here?

Copy link
Author

Copilot AI commented Nov 14, 2025

internal/output/sarif_fingerprint_test.go:83 @copilot why can't we use range here?

We can use range here! The current code uses j := i + 1 to avoid comparing pairs twice and comparing elements with themselves. This could be refactored to:

for j := range testCases[i+1:] {
    actualJ := i + 1 + j
    if testCases[i].fingerprint == testCases[actualJ].fingerprint {
        // ...
    }
}

However, the current form is clearer and more direct for this nested comparison pattern. The explicit indexing makes it immediately obvious we're comparing each element with all subsequent elements. Would you like me to refactor it to use range?

Copilot finished work on behalf of brabster November 14, 2025 08:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Deduplicate SARIF outputs in GitHub security tab

2 participants