fix: SMALLINT addition overflow should error instead of silently wrapping#21319
fix: SMALLINT addition overflow should error instead of silently wrapping#21319xiedeyantu wants to merge 5 commits intoapache:mainfrom
Conversation
|
The datafusion-testing repository should be updated to align with this change. I have forked the datafusion-testing project on my GitHub and submitted a apache/datafusion-testing#18. I'm not sure about the merging process. The current step is to update the submodule using the datafusion-testing PR. |
|
@alamb Could you please take a look and let me know if this issue needs to be fixed? |
|
I am trying to figure out if this is an theroetical concern or if it is an error that someone has actually reported? I just worry it will introduce churn / failures for people's whose queries are currently running fine. |
In any case, I believe this must be an error. However, different query engines may exhibit their own distinct behaviors. If you feel that this issue does not require a fix, we can certainly leave things as they are. |
Which issue does this PR close?
Rationale for this change
DataFusion was silently wrapping on integer arithmetic overflow instead of returning an error. For example:
This behavior is incorrect. DuckDB and PostgreSQL both return an error on integer overflow, which is the expected behavior for SQL engines.
What changes are included in this PR?
BinaryExpr::new (binary.rs): Change the default value offail_on_overflowfromfalsetotrue. This causes+,-, and*on integer types to use Arrow's checked kernels instead of the wrapping variants, returning an error on overflow.dynamic_filters.rsandphysical_planner.rsto reflect the new default.explain_analyze.sltto use slt:ignore foroutput_byteson integer arithmetic projections, since Arrow's checked kernels may allocate a validity buffer, slightly changing the reported byte size.operator.sltcovering overflow forTINYINT,SMALLINT,INT, andBIGINTwith+,-, and*.Are these changes tested?
Yes.
operator.sltverify that overflow on all integer widths (Int8,Int16,Int32,Int64) returns an error for addition, subtraction, and multiplication.test_add_with_overflow,test_subtract_with_overflow, andtest_mul_with_overflowinbinary.rscontinue to pass (they already usedwith_fail_on_overflow(true)explicitly).Are there any user-facing changes?
Yes. Integer arithmetic (
+,-,*) on typesTINYINT,SMALLINT,INT, andBIGINTnow returns an error on overflow instead of silently wrapping. This aligns DataFusion's behavior with DuckDB and PostgreSQL.