Description
What's up?
Currently a PRQL append transform will result in UNION ALL:
from tbl_a
append tbl_b
gives
SELECT * FROM tbl_a UNION ALL SELECT * FROM tbl_b
-- Generated by PRQL compiler version:0.13.3-39-ge393ab4d (https://prql-lang.org)

However, this results in issues like #4724, #2680, and #3184, where the underlying cause is that UNION ALL is interpreted by the database as "unify by column position" rather than "unify by column name". See the DuckDB docs:
Traditional set operations unify queries by column position, and require the to-be-combined queries to have the same number of input columns. If the columns are not of the same type, casts may be added. The result will use the column names from the first query.
DuckDB also supports UNION [ALL] BY NAME, which joins columns by name instead of by position. UNION BY NAME does not require the inputs to have the same number of columns. NULL values will be added in case of missing columns.
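To make the positional behavior concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in database. The column names (id, name) and the row values are assumptions made up for illustration; only the table names tbl_a and tbl_b come from the example above. SQLite has no UNION ALL BY NAME, so the by-name result is emulated here by listing the columns explicitly on both sides, which is roughly the effect BY NAME has when both inputs share the same column names.

```python
import sqlite3

# In-memory database; the schemas are hypothetical and differ only in column order.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tbl_a (id INTEGER, name TEXT);
    CREATE TABLE tbl_b (name TEXT, id INTEGER);
    INSERT INTO tbl_a VALUES (1, 'alice');
    INSERT INTO tbl_b VALUES ('bob', 2);
    """
)

# What `from tbl_a | append tbl_b` currently compiles to: columns are matched
# by position, so tbl_b.name lands in the first (id) column.
positional = conn.execute(
    "SELECT * FROM tbl_a UNION ALL SELECT * FROM tbl_b"
).fetchall()

# Emulated "by name" behavior: naming the columns on both sides lines them up
# regardless of their declared order.
by_name = conn.execute(
    "SELECT id, name FROM tbl_a UNION ALL SELECT id, name FROM tbl_b"
).fetchall()

print("positional:", positional)
print("by name:   ", by_name)
conn.close()
```

In the positional result the row from tbl_b comes back with its values swapped relative to the column names taken from tbl_a, and the database raises no error. That silent mix-up is the failure mode behind the linked issues.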
Questions:
- Should append always behave as UNION ALL BY NAME to simplify semantics from the user perspective, and also make the compiler's job easier? This would resolve all of the linked issues above without needing to dive into compiler details, but would be a breaking change for users expecting traditional UNION ALL behavior.
- If "no" to the above question, can we add a by:name or by:position argument to append to allow users to use UNION ALL BY NAME when that makes sense for their use case?