-
Notifications
You must be signed in to change notification settings - Fork 49
Implement INSERT and UPDATE value type casting
#276
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
955402f to
d04ff75
Compare
ff98a29 to
a1eb12e
Compare
a1eb12e to
8bd98ae
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 5 out of 5 changed files in this pull request and generated 11 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
8bd98ae to
b7f7921
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| $this->assertQuery( "INSERT INTO t VALUES ('3.0')" ); | ||
|
|
||
| // TODO: These are supported in MySQL: | ||
| $this->assertQueryError( "INSERT INTO t VALUES ('4.5')", 'SQLSTATE[23000]: Integrity constraint violation: 19 cannot store REAL value in INTEGER column t.value' ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call describing known limitations of the system with tests
| * @throws Exception The error message. | ||
| * @return void | ||
| */ | ||
| public function throw( $message ): void { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised this is not a reserved keyword and makes a valid method name. PHP docs confirms it's safe:
They are, however, allowed as property, constant, and method names of classes, interfaces and traits, except that class may not be used as constant name.
TIL
| if ( 0 === count( $columns ) ) { | ||
| throw $this->new_driver_exception( | ||
| sprintf( | ||
| "SQLSTATE[42S02]: Base table or view not found: 1146 Table '%s' doesn't exist", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MySQL error codes! Nice
| $this->cast_value_in_non_strict_mode( $column['DATA_TYPE'], $identifier ), | ||
| $identifier | ||
| ); | ||
| $fragment .= $this->cast_value_for_insert_or_update( $column['DATA_TYPE'], $identifier ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cast_value_for_insert_or_update – Any other statements that need casting? E.g. REPLACE INTO?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adamziel Yeah, REPLACE as well. Maybe we could use a more generic name like cast_value_for_saving or something along those lines?
| * | ||
| * When the strict mode is not enabled, executing an UPDATE statement that | ||
| * sets a NOT NULL column value to NULL saves an IMPLICIT DEFAULT instead. | ||
| * This method applies relevant type casting and emulates IMPLICIT DEFAULT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's drop a couple of links to the relevant MySQL doc page in a few places as a reference:
https://dev.mysql.com/doc/refman/8.4/en/data-type-defaults.html
| string $translated_value | ||
| ): string { | ||
| $sqlite_data_type = self::DATA_TYPE_STRING_MAP[ $mysql_data_type ]; | ||
| // TODO: This is also a good place to implement checks for maximum column |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
| } elseif ( 'year' === $mysql_data_type ) { | ||
| $function_call = sprintf( "STRFTIME('%%Y', %s)", $translated_value ); | ||
| /* | ||
| * The YEAR type in MySQL only uses 1 byte and therefore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good comment
| * @return string The translated value. | ||
| */ | ||
| private function cast_value_in_non_strict_mode( | ||
| private function cast_value_for_insert_or_update( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to consider any potential interferences with SQLite's type affinity?
sqlite> create table t(a int);
sqlite> insert into t(a) values (22), ('33');
sqlite> select a from t where a = '22' or a = 33;
22
33There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adamziel This particular example behaves the same in MySQL, but generally, it can be a problem when an expression evaluates types differently, or comparisons are automatically type casted in one DB but not in the other one.
CREATE TABLE t(a DATE);
INSERT INTO t(a) VALUES ('2021-01-01');
SELECT * FROM t WHERE a = '20210101'This returns one row in MySQL but no rows in SQLite. It's probably difficult to solve—it would require inferring the correct type of every part of every expression. Fortunately, the most common numeric string cast seems to work in both DBs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adamziel Checking further, the situation is not that good:
SELECT 1 = '1'; -- true in MySQL, false in SQLite
SELECT 1 WHERE 1 = '1'; -- one row in MySQL, no rows in SQLiteSo it appears that the numeric string type cast is only applied when comparing against a numeric column.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can probably come up with a sqlite expression equivalent to MYSQL_TYPECAST( value, column_type ) => value_in_column_type that pattern-matches and transforms the value in the same way as mysql-server would. Maybe we can even source the logic from the actual mysql-server codebase
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adamziel I think that's basically what the cast_value_for_saving() method does. It knows the column type, and it has a value expression, and then it wraps it in a casting expression. We can make it more exhaustive and handle all edge cases, but the core logic is there.
What will be a problem, though, is expressions where we have no column type at all—things like WHERE '1' = 1, but with arbitrary complexity. Anyway, that's probably a topic for another PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Separate pr sounds good! The transform there would be more like (lhs value, lhs type, comparison, rhs value, rhs type) -> expression that evaluates to true or false using mysql rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adamziel Oh, right. Seems like it's definitely better to tackle this separately when needed. Rewriting all expressions on every nesting level seems like a bigger topic and it could possibly cause some performance issues (like index not being used due to inline casting, etc.).
Otherwise, do you think this is good to merge?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup!
| // In non-strict mode, they get an IMPLICIT DEFAULT value. | ||
| if ( $is_strict_mode ) { | ||
| $fallback = sprintf( | ||
| "THROW('Incorrect %s value: ''' || %s || '''')", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Such a nice use of callbacks 🤯 ! At the same time – could we just throw in place? Do we need to delay it? What's the benefit of doing this vs just throwing a new exception right here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adamziel We need to throw the error from SQLite, because in PHP we don't know the value. You can insert known values (INSERT INTO ... VALUES (1, 2, 3)), as well as unknown ones INSERT INTO ... SELECT .... For that reason, we need SQLite to evaluate this for every inserted/updated row, as we can't do that in PHP. In PHP the "value" can be a column name, an expression, etc.
|
@adamziel I added two commits with small fixes. |
This PR originates from the following error (#268):
However, there isn't a simple fix. It requires correctly casting values for
INSERT,REPLACE, andUPDATEstatements, and that's what this PR implements. Type casting logic, together with "implicit defaults", was already implemented for MySQL's "non-strict" mode. This PR extends the type casting part to be used with both strict and non-strict modes.Generally, it does the following:
The PR comes with a comprehensive test suite, but it doesn't implement all tiny MySQL nuances and acknowledges some differences in TODO comments. It should increase the type compatibility when saving values significantly. We can address more edge cases separately, and it should also enable us checking for column lengths and trimming or enforcing the values based on the SQL mode.
Fixes #268.