fix: SortPreservingMerge sanity check rejects valid ORDER BY with CASE expression #18342
Merged: alamb merged 7 commits into apache:main from watford-ep:bug/18327-orderby-case-sanity-check-failure on Oct 30, 2025
Commits
- da4098a Add failing test cases for #18327 (watford-ep)
- ad27a36 Don't fail bounds checks if lhs/rhs is null #18327 (watford-ep)
- 291526d move test (alamb)
- c8effb6 Merge remote-tracking branch 'apache/main' into bug/18327-orderby-cas… (alamb)
- dffb0c2 Add test suggested by @asolimando (alamb)
- 6f35668 Apply suggestion from @alamb (watford-ep)
- 2b447e2 Add tests for internal arithmetic and boolean short circuiting (watford-ep)
This makes sense to me: under classic three-valued logic, it's fine to return `unknown` in such cases.

This method should be used with care, as it would probably be unsafe for simplifications: as the TODO suggests, we are not handling cases like `FALSE AND UNKNOWN`, which is `FALSE`, or `TRUE OR UNKNOWN`, which is `TRUE`.

Depending on where we invoke this method, we need to be aware of the unknown-as-false semantics (filters) and the unknown-as-null semantics (order by, projections, etc.), and decide whether it is safe to use this method or not.

Since we were previously erroring out, I think this is an improvement in any case; I just wanted to point that out.
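The two short-circuit cases mentioned above can be illustrated with a minimal sketch of Kleene (SQL-style) three-valued logic. This is not DataFusion code: `and3`, `or3`, and `keeps_row` are hypothetical helper names, and `Option<bool>` with `None` standing for UNKNOWN is an assumption made purely for illustration.

```rust
/// Three-valued AND. `None` stands for UNKNOWN (SQL NULL).
/// A definite `false` on either side decides the result.
fn and3(a: Option<bool>, b: Option<bool>) -> Option<bool> {
    match (a, b) {
        (Some(false), _) | (_, Some(false)) => Some(false),
        (Some(true), Some(true)) => Some(true),
        _ => None,
    }
}

/// Three-valued OR. A definite `true` on either side decides the result.
fn or3(a: Option<bool>, b: Option<bool>) -> Option<bool> {
    match (a, b) {
        (Some(true), _) | (_, Some(true)) => Some(true),
        (Some(false), Some(false)) => Some(false),
        _ => None,
    }
}

/// Unknown-as-false semantics, as in a filter: a row is kept only when
/// the predicate is definitely true.
fn keeps_row(predicate: Option<bool>) -> bool {
    predicate.unwrap_or(false)
}
```

Here `and3(Some(false), None)` yields `Some(false)` and `or3(Some(true), None)` yields `Some(true)`, while `and3(Some(true), None)` stays `None`. In a filter context `keeps_row(None)` drops the row, whereas an ORDER BY or projection would carry the `None` through as a null value.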
I tried to find places that would be confused by this change, but there is quite a bit of logic around `Interval`, and tracing the implications is a bit beyond me as a first-time DataFusion contributor. On the positive side, `Interval::UNCERTAIN` seems to be handled uniformly from what I found. There also seems to be test coverage in the optimizer should those TODOs be acted on (`test_simplify_with_guarantee`, `test_inequalities_maybe_null`, etc.).

@alamb @asolimando I'm happy to explore adding those bounds here.
Very interesting... I started by adding tests for and/or, and they already worked without any code changes, so I think my TODO statements were premature and unnecessary.
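One way to see why such and/or cases can fall out without special handling is to model a boolean interval as a pair of bounds and combine the bounds pointwise. The sketch below is a toy model written for this discussion, not DataFusion's `Interval` implementation: `BoolInterval` and its methods are hypothetical, and of the constant names only `UNCERTAIN` echoes a name mentioned in the thread.

```rust
/// Toy boolean interval: the range of truth values an expression may take,
/// as inclusive bounds with `false < true`. Illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct BoolInterval {
    lower: bool,
    upper: bool,
}

const CERTAINLY_FALSE: BoolInterval = BoolInterval { lower: false, upper: false };
const CERTAINLY_TRUE: BoolInterval = BoolInterval { lower: true, upper: true };
/// Could be either value; this is what an unknown operand maps to here.
const UNCERTAIN: BoolInterval = BoolInterval { lower: false, upper: true };

impl BoolInterval {
    /// AND is monotone in both arguments, so it can be applied to the bounds.
    fn and(self, other: Self) -> Self {
        BoolInterval {
            lower: self.lower && other.lower,
            upper: self.upper && other.upper,
        }
    }

    /// OR is monotone as well, so the same bound-wise approach works.
    fn or(self, other: Self) -> Self {
        BoolInterval {
            lower: self.lower || other.lower,
            upper: self.upper || other.upper,
        }
    }
}
```

Under this model, `CERTAINLY_FALSE.and(UNCERTAIN)` collapses to `CERTAINLY_FALSE` and `CERTAINLY_TRUE.or(UNCERTAIN)` collapses to `CERTAINLY_TRUE`, in either operand order, while `CERTAINLY_TRUE.and(UNCERTAIN)` remains `UNCERTAIN`.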
@watford-ep sounds good! Can you also add the reverse (`true or null` where you now test `null or true`), just to make sure that we short-circuit the right way regardless of the operands' order?
Done.