-
Couldn't load subscription status.
- Fork 1.8k
Update diff-informed testing to always treat sources and sinks as alert locations #20607
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
a1c5d9d to
9144f52
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR updates diff-informed testing configurations to always treat sources and sinks as alert locations, aligning with the current SARIF output behavior where sources and sinks are always included in related locations.
Key changes include:
- Removing
getASelectedSourceLocationandgetASelectedSinkLocationoverrides that returnednone() - Adding source/sink locations alongside existing location selections
- Updating documentation to clarify that path-problem queries should always include source/sink locations
Reviewed Changes
Copilot reviewed 76 out of 76 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| shared/dataflow/codeql/dataflow/DataFlow.qll | Updated documentation for location selection predicates to clarify path-problem query requirements |
| Multiple language query files | Removed overrides that excluded source locations or added sink locations to existing selections |
| rust/ql/src/queries/security/CWE-614/InsecureCookie.ql | Disabled diff-informed mode for negatively-used config |
| Various Swift, C++, Java, JS, Python, Ruby files | Updated location selection to include both source/sink and existing locations |
| Location getASelectedSinkLocation(DataFlow::Node sink) { | ||
| exists(DataFlow::Node cleanSink | result = cleanSink.getLocation() | | ||
| cleanSink = sink.(DataFlow::PostUpdateNode).getPreUpdateNode() | ||
| or | ||
| not sink instanceof DataFlow::PostUpdateNode and | ||
| cleanSink = sink | ||
| ) | ||
| result = sink.(CleartextStoragePreferencesSink).getLocation() | ||
| or | ||
| result = sink.(DataFlow::PostUpdateNode).getPreUpdateNode().getLocation() |
Copilot
AI
Oct 15, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The location selection logic could be simplified. Consider extracting the common pattern of handling PostUpdateNode locations into a helper predicate to reduce code duplication across similar configurations.
See below for a potential fix:
/**
* Helper predicate to get the location from a PostUpdateNode's pre-update node.
*/
predicate getPostUpdateNodeLocation(DataFlow::Node node, Location loc) {
loc = node.(DataFlow::PostUpdateNode).getPreUpdateNode().getLocation()
}
Location getASelectedSinkLocation(DataFlow::Node sink) {
result = sink.(CleartextStoragePreferencesSink).getLocation()
or
getPostUpdateNodeLocation(sink, result)
| * additional location ("$@" interpolation). Queries with `@kind path-problem` | ||
| * that override this predicate should also return the location of the source | ||
| * itself. For a query that doesn't report the source at all, this predicate | ||
| * should be `none()`. |
Copilot
AI
Oct 15, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The documentation could be clearer about when to use none(). Consider adding an example of when a query 'doesn't report the source at all' to help developers understand this edge case.
| * should be `none()`. | |
| * should be `none()`. | |
| * | |
| * Example: | |
| * ``` | |
| * // If your query does not report the source location at all, override as: | |
| * override Location getASelectedSourceLocation(Node source) { none() } | |
| * ``` |
| exists(DataFlow::CallNode openCall | result = [openCall.getLocation(), source.getLocation()] | | ||
| isWritableFileHandle(source, openCall) |
Copilot
AI
Oct 15, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Using a list expression [openCall.getLocation(), source.getLocation()] for result assignment is unconventional. Consider using separate or branches for better readability and consistency with other files in this PR.
| exists(DataFlow::CallNode openCall | result = [openCall.getLocation(), source.getLocation()] | | |
| isWritableFileHandle(source, openCall) | |
| exists(DataFlow::CallNode openCall | | |
| isWritableFileHandle(source, openCall) and | |
| (result = openCall.getLocation() or result = source.getLocation()) |
| result = [target.getLocation(), source.getLocation()] | ||
| | |
Copilot
AI
Oct 15, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Similar to the Go file, using a list expression for result assignment is inconsistent with the pattern used elsewhere in this PR. Consider using separate or branches for consistency.
| result = [target.getLocation(), source.getLocation()] | |
| | | |
| ( | |
| result = target.getLocation() or | |
| result = source.getLocation() | |
| ) and |
java/ql/lib/semmle/code/java/security/ConditionalBypassQuery.qll
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, putting this in my own words, getASelectedSourceLocation and getASelectedSinkLocation now need to explicitly specify the source / sink location itself as one of their results, where they did not need to before. The default behaviour for configurations that don't override the predicates remains the same, and an any() implementation also remains the same as the source / sink is already included.
Rust and Swift 👍
swift/ql/src/experimental/Security/CWE-022/UnsafeUnpack.ql
actions/ql/src/Security/CWE-829/ArtifactPoisoningCritical.ql actions/ql/src/Security/CWE-829/ArtifactPoisoningMedium.ql
actions/ql/src/Security/CWE-077/EnvPathInjectionMedium.ql actions/ql/src/Security/CWE-077/EnvPathInjectionCritical.ql
actions/ql/src/Security/CWE-077/EnvVarInjectionMedium.ql actions/ql/src/Security/CWE-077/EnvVarInjectionCritical.ql
actions/ql/src/experimental/Security/CWE-088/ArgumentInjectionCritical.ql actions/ql/src/experimental/Security/CWE-088/ArgumentInjectionMedium.ql
actions/ql/src/Security/CWE-094/CodeInjectionMedium.ql actions/ql/src/Security/CWE-094/CodeInjectionCritical.ql
4b1adc8 to
a0975e7
Compare
| or | ||
| exists(FormattingFunctionCall call, Expr formatString | result = call.getLocation() | | ||
| exists(FormattingFunctionCall call, Expr formatString | | ||
| result = [call.getLocation(), sink.getLocation()] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, great. A sink location is only relevant if the below holds.
|
|
||
| Location getASelectedSourceLocation(DataFlow::Node source) { | ||
| result = getExpr(source).getLocation() | ||
| isSource(source) and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not necessary for correctness, but it reduces the size of getASelectedSourceLocation from all nodes down to the sources.
| * itself. For a query that doesn't report the source at all, this predicate | ||
| * should be `none()`. | ||
| */ | ||
| default Location getASelectedSourceLocation(Node source) { result = source.getLocation() } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the default of this predicate also include and isSource(source) (the dual question to the previous comment on why isSource(...) was introduced)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that change would maintain correctness, but I can't say a priori (before testing the performance of it) whether the predicate's smaller size (leading to smaller joins down the line) outweighs the extra join inside the predicate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Now that sources/sinks are always added to the Sarif related locations, this PR updates location overrides of diff-informed queries to match that behaviour.
Methodology
$ git grep -P 'getASelectedSourceLocation|getASelectedSinkLocation'Actions (9)
C++ (1)
C# (1)
Go (4)
Java (23)
JS (16)
Python (8)
Ruby (11)
Shared (3) (documentation only)
Swift (8)
But for some reason this didn't catch all of them. For the rest, I used
codeql test --check-diff-informedto alert me to other queries that needed to be updated because their tests were now failing.Questions