-
Notifications
You must be signed in to change notification settings - Fork 62
fix: lossless decimal decode (fixes float64 precision loss) + reachable native-decimal opt-in #370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
vikrantpuppala
wants to merge
3
commits into
main
Choose a base branch
from
feat/native-decimal-support-correct
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
3 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,116 @@ | ||
| package arrowbased | ||
|
|
||
| import ( | ||
| "testing" | ||
|
|
||
| "github.com/apache/arrow/go/v12/arrow" | ||
| "github.com/apache/arrow/go/v12/arrow/array" | ||
| "github.com/apache/arrow/go/v12/arrow/decimal128" | ||
| "github.com/apache/arrow/go/v12/arrow/memory" | ||
| "github.com/stretchr/testify/assert" | ||
| "github.com/stretchr/testify/require" | ||
| ) | ||
|
|
||
| // TestDecimal128ContainerValue verifies that the native Arrow decimal path | ||
| // decodes DECIMAL columns losslessly as strings, preserving full precision and | ||
| // scale, and that nulls are surfaced as nil. This is the value-level coverage | ||
| // for the path exercised when UseArrowNativeDecimal is enabled (the boundary | ||
| // test in TestArrowRowScanner only proves ScanRow no longer blocks decimals). | ||
| func TestDecimal128ContainerValue(t *testing.T) { | ||
| const precision, scale int32 = 38, 18 | ||
|
|
||
| // A 38-digit, scale-18 value that float64 (≈15-17 significant digits) | ||
| // cannot represent exactly — the regression guard for lossy conversion. | ||
| const highPrecision = "12345678901234567890.123456789012345678" | ||
|
|
||
| // Note: ToString(scale) always renders to the column's full declared scale | ||
| // (18 here), so values are normalized — e.g. "3.30" -> "3.300000000000000000". | ||
| // This is lossless and deterministic, unlike a float64 conversion. | ||
| cases := []struct { | ||
| name string | ||
| input string // empty string => null | ||
| expected any | ||
| }{ | ||
| {name: "high precision preserved", input: highPrecision, expected: highPrecision}, | ||
| {name: "simple value normalized to scale", input: "3.30", expected: "3.300000000000000000"}, | ||
| {name: "negative value", input: "-0.000000000000000001", expected: "-0.000000000000000001"}, | ||
| {name: "zero", input: "0.000000000000000000", expected: "0.000000000000000000"}, | ||
| {name: "null", input: "", expected: nil}, | ||
| } | ||
|
|
||
| for _, tc := range cases { | ||
| t.Run(tc.name, func(t *testing.T) { | ||
| mem := memory.NewGoAllocator() | ||
| dt := &arrow.Decimal128Type{Precision: precision, Scale: scale} | ||
| b := array.NewDecimal128Builder(mem, dt) | ||
| defer b.Release() | ||
|
|
||
| if tc.input == "" { | ||
| b.AppendNull() | ||
| } else { | ||
| num, err := decimal128.FromString(tc.input, precision, scale) | ||
| require.NoError(t, err) | ||
| b.Append(num) | ||
| } | ||
|
|
||
| arr := b.NewDecimal128Array() | ||
| defer arr.Release() | ||
|
|
||
| container := &decimal128Container{scale: scale} | ||
| err := container.SetValueArray(arr.Data()) | ||
| require.NoError(t, err) | ||
|
|
||
| assert.Equal(t, tc.input == "", container.IsNull(0)) | ||
|
|
||
| got, err := container.Value(0) | ||
| require.NoError(t, err) | ||
| assert.Equal(t, tc.expected, got) | ||
| }) | ||
| } | ||
| } | ||
|
|
||
| // TestDecimal128NestedInListPreservesPrecision guards the DEFAULT path: complex | ||
| // types are native by default (UseArrowNativeComplexTypes=true), so a decimal | ||
| // nested inside an ARRAY/STRUCT is decoded by decimal128Container even when the | ||
| // top-level UseArrowNativeDecimal flag is off. Before the lossless-string fix | ||
| // this path went through ToFloat64 and silently corrupted high-precision values | ||
| // (and rendered SQL NULL as the number 0). This test pins the lossless behavior. | ||
| func TestDecimal128NestedInListPreservesPrecision(t *testing.T) { | ||
| const precision, scale int32 = 38, 18 | ||
| const highPrecision = "12345678901234567890.123456789012345678" | ||
|
|
||
| mem := memory.NewGoAllocator() | ||
| elemType := &arrow.Decimal128Type{Precision: precision, Scale: scale} | ||
| lb := array.NewListBuilder(mem, elemType) | ||
| defer lb.Release() | ||
| vb := lb.ValueBuilder().(*array.Decimal128Builder) | ||
|
|
||
| // One list: [highPrecision, NULL, "3.300000000000000000"] | ||
| lb.Append(true) | ||
| for _, s := range []string{highPrecision, "", "3.30"} { | ||
| if s == "" { | ||
| vb.AppendNull() | ||
| continue | ||
| } | ||
| num, err := decimal128.FromString(s, precision, scale) | ||
| require.NoError(t, err) | ||
| vb.Append(num) | ||
| } | ||
|
|
||
| listArr := lb.NewListArray() | ||
| defer listArr.Release() | ||
|
|
||
| lvc := &listValueContainer{ | ||
| listArray: listArr, | ||
| listArrayType: arrow.ListOf(elemType), | ||
| values: &decimal128Container{scale: scale}, | ||
| } | ||
| require.NoError(t, lvc.values.SetValueArray(listArr.ListValues().Data())) | ||
|
|
||
| got, err := lvc.Value(0) | ||
| require.NoError(t, err) | ||
|
|
||
| // Decimals are rendered as lossless JSON strings; NULL stays null (not 0). | ||
| expected := `["12345678901234567890.123456789012345678",null,"3.300000000000000000"]` | ||
| assert.Equal(t, expected, got) | ||
| } |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but this is changing the return type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch — yes,
decimal128Container.Valuechanges its return fromfloat64tostring. Worth being precise about who can observe that:Top-level decimal columns: the native path that produces this value was unreachable before this PR —
UseArrowNativeDecimalhad no DSN key or connector option, so no user could turn it on. So the oldfloat64return was never observable; there's no behavior to break. This PR makes the path reachable (viaWithUseArrowNativeDecimal) and lands it onstring, which is what the existing non-native default path and the column-based path already return for decimals — so the type is now consistent across transports rather than newly divergent.Decimals nested in complex types (array/struct): this container is reached today, because complex types are native by default. Here the type does change (JSON number → JSON string in the marshaled value). But the old
float64value was corrupt — e.g.12345678901234567890.123456789012345678decoded as1.234567890123457e+19, and SQLNULLdecoded as0. So this is fixing silently-wrong output, not regressing correct output.TestDecimal128NestedInListPreservesPrecisionpins this and was verified to fail against the oldToFloat64decode.On why
stringrather than a numeric type:database/sql'sdriver.Valuehas no decimal type (onlyint64/float64/bool/[]byte/string/time.Time), so a lossless decimal has to come back as a string for the caller to decode intoshopspring/decimal,*big.Rat, etc. It's the same choicelib/pq/pgxmake forNUMERIC.Happy to add a typed wrapper instead if you'd prefer that direction — but it'd be a new public type and a larger surface, so I kept this PR to the lossless-string fix. Let me know.