feat(parquet): batch consecutive null/empty rows in write_list#9752

Merged: alamb merged 1 commit into apache:main from HippoBaro:batch_consecute_null_rows on Apr 22, 2026
Conversation

@HippoBaro
Contributor

Which issue does this PR close?

Rationale for this change

See #9731

What changes are included in this PR?

Restructure write_list() to accumulate consecutive null and empty rows and flush them in a single visit_leaves() call using extend(repeat_n(...)), instead of calling visit_leaves() per row.

With sparse data (99% nulls), a 4096-row batch previously triggered ~4000 individual tree traversals, each pushing a single value per leaf. Now consecutive null/empty runs are collapsed into one traversal that extends all leaf level buffers in bulk.

This follows the same pattern already used by write_struct(). The write_non_null_slice path is unchanged since each non-null row has different offsets and cannot be batched.
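
To make the batching idea concrete, here is a minimal, self-contained sketch. It is not the code from this PR: `LeafLevels`, `Row`, `Levels`, and both functions are hypothetical stand-ins, the level values are placeholders, and a row with values is reduced to a single level entry for brevity. It only illustrates how a run of identical null/empty rows can be appended with one `extend(repeat_n(...))` instead of one push per row.

```rust
use std::iter::repeat_n;

/// Simplified stand-in for one leaf's level buffers (hypothetical type).
#[derive(Debug, Default, PartialEq)]
struct LeafLevels {
    def_levels: Vec<i16>,
    rep_levels: Vec<i16>,
}

/// Hypothetical classification of one list row.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Row {
    Null,
    Empty,
    Values,
}

/// Placeholder level values; the real ones come from the writer's context.
#[derive(Clone, Copy)]
struct Levels {
    null_def: i16,
    empty_def: i16,
    value_def: i16,
    rep: i16,
}

impl Levels {
    fn def_for(self, row: Row) -> i16 {
        match row {
            Row::Null => self.null_def,
            Row::Empty => self.empty_def,
            Row::Values => self.value_def,
        }
    }
}

/// Baseline: one visit (one push per buffer) for every row.
fn write_per_row(rows: &[Row], lv: Levels, leaf: &mut LeafLevels) -> usize {
    for &row in rows {
        leaf.def_levels.push(lv.def_for(row));
        leaf.rep_levels.push(lv.rep);
    }
    rows.len()
}

/// Batched: each run of identical null/empty rows costs a single visit.
fn write_batched(rows: &[Row], lv: Levels, leaf: &mut LeafLevels) -> usize {
    let mut visits = 0;
    let mut start = 0;
    while start < rows.len() {
        let kind = rows[start];
        // Length of the run of rows of the same kind, starting at `start`.
        let run = rows[start..].iter().take_while(|&&r| r == kind).count();
        // Rows with values are not batched: each one stays its own visit.
        let chunk = if kind == Row::Values { 1 } else { run };
        leaf.def_levels.extend(repeat_n(lv.def_for(kind), chunk));
        leaf.rep_levels.extend(repeat_n(lv.rep, chunk));
        visits += 1;
        start += chunk;
    }
    visits
}
```

Both functions produce identical buffers; only the number of visits differs, which is where the savings on sparse input would come from.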

Are these changes tested?

All tests passing; existing tests give 100% coverage.

Are there any user-facing changes?

N/A

Signed-off-by: Hippolyte Barraud <hippolyte.barraud@datadoghq.com>
@HippoBaro
Contributor Author

cc @alamb; this is another part of #9653 that needs your attention 🙇

Contributor

@alamb alamb left a comment

Thank you @HippoBaro -- this looks like a nice improvement to me

One thought I had that could make this code easier to follow (maybe as a follow-on PR) would be to encapsulate the logic (pending_empties, pending_nulls, and the logic to write the slices) somehow, maybe in a struct or a method on LevelInfoBuilder.
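
One possible shape for that encapsulation, purely as a sketch: `PendingRuns` and `LeafLevels` below are hypothetical names, not types from the parquet crate, and the level values are passed in as plain parameters rather than read from the real context. The idea is that the two counters and the flush logic live in one place, and switching between a null run and an empty run flushes the previous run so that row order is preserved.

```rust
use std::iter::repeat_n;

/// Simplified stand-in for one leaf's level buffers (hypothetical type).
#[derive(Debug, Default)]
struct LeafLevels {
    def_levels: Vec<i16>,
    rep_levels: Vec<i16>,
}

/// Hypothetical accumulator for consecutive null/empty rows.
/// At most one of the two counters is non-zero at any time.
struct PendingRuns {
    pending_nulls: usize,
    pending_empties: usize,
    null_def: i16,
    empty_def: i16,
    rep: i16,
}

impl PendingRuns {
    fn new(null_def: i16, empty_def: i16, rep: i16) -> Self {
        Self { pending_nulls: 0, pending_empties: 0, null_def, empty_def, rep }
    }

    /// Record a null row, flushing a pending empty run first to keep order.
    fn push_null(&mut self, leaf: &mut LeafLevels) {
        if self.pending_empties > 0 {
            self.flush(leaf);
        }
        self.pending_nulls += 1;
    }

    /// Record an empty row, flushing a pending null run first to keep order.
    fn push_empty(&mut self, leaf: &mut LeafLevels) {
        if self.pending_nulls > 0 {
            self.flush(leaf);
        }
        self.pending_empties += 1;
    }

    /// Write whichever run is pending in one bulk extend, then reset.
    /// Call this before writing a non-null row and once at the end.
    fn flush(&mut self, leaf: &mut LeafLevels) {
        let (count, def) = if self.pending_nulls > 0 {
            (self.pending_nulls, self.null_def)
        } else {
            (self.pending_empties, self.empty_def)
        };
        // `repeat_n(_, 0)` yields nothing, so flushing with no pending rows is a no-op.
        leaf.def_levels.extend(repeat_n(def, count));
        leaf.rep_levels.extend(repeat_n(self.rep, count));
        self.pending_nulls = 0;
        self.pending_empties = 0;
    }
}
```

A caller would invoke `push_null` or `push_empty` per row and `flush` before handling any row that has values.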

let def_levels = leaf.def_levels.as_mut().unwrap();
def_levels.push(ctx.def_level - 1);
})
let write_null_run = |child: &mut LevelInfoBuilder, count: usize| {
Contributor

It took me a little while to grok that the empty and null cases got reversed here, but once I got over that it seems good to me

@alamb
Contributor

alamb commented Apr 22, 2026

run benchmark arrow_writer

@adriangbot

🤖 Arrow criterion benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4293454441-1730-hl9kg 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing batch_consecute_null_rows (6244daf) to 89b1497 (merge-base) diff
BENCH_NAME=arrow_writer
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench arrow_writer
BENCH_FILTER=
Results will be posted here when complete



@adriangbot

🤖 Arrow criterion benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)


group                                              batch_consecute_null_rows              main
-----                                              -------------------------              ----
bool/bloom_filter                                  1.00     13.7±0.03ms    18.2 MB/sec    1.00     13.7±0.03ms    18.2 MB/sec
bool/cdc                                           1.00     16.3±0.04ms    15.3 MB/sec    1.00     16.3±0.04ms    15.3 MB/sec
bool/default                                       1.00     11.7±0.03ms    21.5 MB/sec    1.00     11.6±0.03ms    21.5 MB/sec
bool/parquet_2                                     1.00     15.3±0.03ms    16.4 MB/sec    1.00     15.3±0.04ms    16.4 MB/sec
bool/zstd                                          1.00     12.2±0.03ms    20.6 MB/sec    1.00     12.1±0.04ms    20.6 MB/sec
bool/zstd_parquet_2                                1.00     15.7±0.04ms    16.0 MB/sec    1.00     15.6±0.03ms    16.0 MB/sec
bool_non_null/bloom_filter                         1.00      7.0±0.04ms    17.7 MB/sec    1.03      7.2±0.03ms    17.3 MB/sec
bool_non_null/cdc                                  1.01      6.8±0.09ms    18.3 MB/sec    1.00      6.8±0.02ms    18.4 MB/sec
bool_non_null/default                              1.00      4.3±0.02ms    29.3 MB/sec    1.00      4.3±0.02ms    29.2 MB/sec
bool_non_null/parquet_2                            1.00      9.1±0.04ms    13.8 MB/sec    1.00      9.1±0.03ms    13.8 MB/sec
bool_non_null/zstd                                 1.00      4.6±0.02ms    27.0 MB/sec    1.00      4.6±0.02ms    26.9 MB/sec
bool_non_null/zstd_parquet_2                       1.00      9.4±0.03ms    13.3 MB/sec    1.00      9.5±0.03ms    13.2 MB/sec
float_with_nans/bloom_filter                       1.01     94.9±0.29ms   147.6 MB/sec    1.00     93.6±0.29ms   149.7 MB/sec
float_with_nans/cdc                                1.02     83.5±0.31ms   167.7 MB/sec    1.00     82.3±0.14ms   170.2 MB/sec
float_with_nans/default                            1.00     75.3±0.21ms   185.8 MB/sec    1.00     75.1±0.19ms   186.4 MB/sec
float_with_nans/parquet_2                          1.01     98.3±0.33ms   142.4 MB/sec    1.00     97.4±0.30ms   143.7 MB/sec
float_with_nans/zstd                               1.00    113.3±0.40ms   123.6 MB/sec    1.00    112.9±0.23ms   124.0 MB/sec
float_with_nans/zstd_parquet_2                     1.01    136.1±1.69ms   102.9 MB/sec    1.00    134.4±0.30ms   104.2 MB/sec
list_primitive/bloom_filter                        1.04   329.4±12.76ms  1655.5 MB/sec    1.00    315.8±1.22ms  1727.1 MB/sec
list_primitive/cdc                                 1.03    366.6±3.39ms  1487.4 MB/sec    1.00    357.1±6.96ms  1527.1 MB/sec
list_primitive/default                             1.04    247.9±5.08ms     2.1 GB/sec    1.00    239.3±0.88ms     2.2 GB/sec
list_primitive/parquet_2                           1.01    267.8±4.87ms  2036.8 MB/sec    1.00    266.4±6.38ms  2047.4 MB/sec
list_primitive/zstd                                1.00    498.5±4.96ms  1094.1 MB/sec    1.00    496.2±2.74ms  1099.0 MB/sec
list_primitive/zstd_parquet_2                      1.02   494.2±10.00ms  1103.5 MB/sec    1.00    482.7±0.48ms  1129.7 MB/sec
list_primitive_non_null/bloom_filter               1.00    432.7±6.26ms  1257.7 MB/sec    1.04    449.6±6.83ms  1210.4 MB/sec
list_primitive_non_null/cdc                        1.01    442.7±8.85ms  1229.5 MB/sec    1.00    438.6±9.13ms  1240.8 MB/sec
list_primitive_non_null/default                    1.03   323.7±10.34ms  1681.2 MB/sec    1.00    314.3±7.89ms  1731.8 MB/sec
list_primitive_non_null/parquet_2                  1.00   313.4±12.70ms  1736.4 MB/sec    1.05   327.8±21.88ms  1660.2 MB/sec
list_primitive_non_null/zstd                       1.00    715.6±9.14ms   760.6 MB/sec    1.01   720.1±18.65ms   755.8 MB/sec
list_primitive_non_null/zstd_parquet_2             1.00    689.4±7.33ms   789.4 MB/sec    1.04    715.8±7.90ms   760.4 MB/sec
list_primitive_sparse_99pct_null/bloom_filter      1.00     33.9±0.10ms  1102.7 MB/sec    1.24     42.2±0.52ms   885.8 MB/sec
list_primitive_sparse_99pct_null/cdc               1.00     44.2±0.10ms   844.9 MB/sec    1.19     52.6±0.53ms   709.9 MB/sec
list_primitive_sparse_99pct_null/default           1.00     33.5±0.07ms  1113.8 MB/sec    1.25     41.9±0.54ms   892.8 MB/sec
list_primitive_sparse_99pct_null/parquet_2         1.00     33.6±0.07ms  1113.1 MB/sec    1.25     42.0±0.55ms   888.9 MB/sec
list_primitive_sparse_99pct_null/zstd              1.00     35.4±0.07ms  1055.0 MB/sec    1.23     43.7±0.52ms   854.4 MB/sec
list_primitive_sparse_99pct_null/zstd_parquet_2    1.00     33.7±0.09ms  1107.8 MB/sec    1.25     42.0±0.51ms   889.0 MB/sec
primitive/bloom_filter                             1.00    155.2±0.52ms   289.2 MB/sec    1.01    157.1±7.37ms   285.7 MB/sec
primitive/cdc                                      1.00    163.0±0.53ms   275.3 MB/sec    1.00    162.5±0.56ms   276.1 MB/sec
primitive/default                                  1.00    124.3±0.26ms   361.0 MB/sec    1.00    124.1±0.15ms   361.6 MB/sec
primitive/parquet_2                                1.00    139.9±1.11ms   320.8 MB/sec    1.00    139.2±1.43ms   322.3 MB/sec
primitive/zstd                                     1.00    154.2±0.20ms   291.1 MB/sec    1.00    153.7±0.21ms   292.0 MB/sec
primitive/zstd_parquet_2                           1.00    173.2±0.95ms   259.2 MB/sec    1.00    172.5±0.35ms   260.2 MB/sec
primitive_all_null/bloom_filter                    1.00     38.8±0.04ms  1155.9 MB/sec    1.00     38.8±0.03ms  1155.4 MB/sec
primitive_all_null/cdc                             1.00     55.3±0.24ms   811.8 MB/sec    1.00     55.3±0.16ms   812.2 MB/sec
primitive_all_null/default                         1.00     38.2±0.02ms  1175.0 MB/sec    1.00     38.2±0.02ms  1174.9 MB/sec
primitive_all_null/parquet_2                       1.00     38.2±0.03ms  1174.9 MB/sec    1.00     38.2±0.02ms  1175.1 MB/sec
primitive_all_null/zstd                            1.00     38.3±0.12ms  1170.9 MB/sec    1.00     38.3±0.02ms  1170.8 MB/sec
primitive_all_null/zstd_parquet_2                  1.00     38.3±0.02ms  1172.5 MB/sec    1.00     38.3±0.02ms  1172.1 MB/sec
primitive_non_null/bloom_filter                    1.01    114.5±0.89ms   384.2 MB/sec    1.00    113.7±1.09ms   386.9 MB/sec
primitive_non_null/cdc                             1.00     91.6±0.17ms   480.5 MB/sec    1.00     91.6±0.26ms   480.5 MB/sec
primitive_non_null/default                         1.00     69.5±0.16ms   632.9 MB/sec    1.00     69.4±0.13ms   633.7 MB/sec
primitive_non_null/parquet_2                       1.00     91.2±0.16ms   482.4 MB/sec    1.00     91.3±0.29ms   481.9 MB/sec
primitive_non_null/zstd                            1.00    106.9±0.19ms   411.6 MB/sec    1.00    107.4±0.35ms   409.7 MB/sec
primitive_non_null/zstd_parquet_2                  1.00    131.0±2.51ms   335.9 MB/sec    1.01    132.0±1.86ms   333.3 MB/sec
primitive_sparse_99pct_null/bloom_filter           1.00     45.0±0.16ms   996.5 MB/sec    1.00     44.9±0.16ms   998.8 MB/sec
primitive_sparse_99pct_null/cdc                    1.00     61.1±0.17ms   734.8 MB/sec    1.00     60.9±0.17ms   737.0 MB/sec
primitive_sparse_99pct_null/default                1.01     43.7±0.24ms  1026.7 MB/sec    1.00     43.5±0.08ms  1032.2 MB/sec
primitive_sparse_99pct_null/parquet_2              1.00     43.5±0.15ms  1030.8 MB/sec    1.00     43.5±0.09ms  1032.7 MB/sec
primitive_sparse_99pct_null/zstd                   1.00     46.8±0.08ms   959.1 MB/sec    1.00     46.7±0.09ms   960.2 MB/sec
primitive_sparse_99pct_null/zstd_parquet_2         1.00     45.4±0.14ms   987.8 MB/sec    1.00     45.4±0.08ms   988.8 MB/sec
string/bloom_filter                                1.00   212.2±16.93ms     2.4 GB/sec    1.01   213.8±17.57ms     2.4 GB/sec
string/cdc                                         1.01    220.7±7.26ms     2.3 GB/sec    1.00    217.9±4.34ms     2.3 GB/sec
string/default                                     1.00   135.2±21.10ms     3.8 GB/sec    1.02   137.7±23.14ms     3.7 GB/sec
string/parquet_2                                   1.00    110.8±6.57ms     4.6 GB/sec    1.05    116.5±0.39ms     4.4 GB/sec
string/zstd                                        1.00    430.9±4.69ms  1216.5 MB/sec    1.02   439.6±17.05ms  1192.5 MB/sec
string/zstd_parquet_2                              1.04   409.0±10.20ms  1281.6 MB/sec    1.00    394.5±0.84ms  1328.8 MB/sec
string_and_binary_view/bloom_filter                1.01     67.2±0.21ms   480.3 MB/sec    1.00     66.4±0.19ms   485.6 MB/sec
string_and_binary_view/cdc                         1.00     60.6±0.10ms   531.8 MB/sec    1.00     60.7±0.12ms   531.1 MB/sec
string_and_binary_view/default                     1.00     50.6±0.11ms   637.3 MB/sec    1.00     50.7±0.11ms   635.9 MB/sec
string_and_binary_view/parquet_2                   1.00     61.5±0.09ms   524.8 MB/sec    1.00     61.5±0.12ms   524.0 MB/sec
string_and_binary_view/zstd                        1.00     87.2±0.17ms   369.9 MB/sec    1.00     87.5±0.14ms   368.6 MB/sec
string_and_binary_view/zstd_parquet_2              1.00     75.4±0.11ms   427.7 MB/sec    1.00     75.4±0.11ms   427.9 MB/sec
string_dictionary/bloom_filter                     1.00     89.6±1.02ms     2.9 GB/sec    1.00     89.9±0.45ms     2.9 GB/sec
string_dictionary/cdc                              1.00     51.0±0.10ms     5.1 GB/sec    1.87     95.1±2.04ms     2.7 GB/sec
string_dictionary/default                          1.00     47.2±0.71ms     5.5 GB/sec    1.04     49.1±0.45ms     5.3 GB/sec
string_dictionary/parquet_2                        1.00     54.5±0.39ms     4.7 GB/sec    1.01     54.9±0.24ms     4.7 GB/sec
string_dictionary/zstd                             1.00    207.7±1.02ms  1271.9 MB/sec    1.00    208.0±1.17ms  1269.9 MB/sec
string_dictionary/zstd_parquet_2                   1.00    198.7±0.22ms  1329.0 MB/sec    1.00    198.3±0.20ms  1332.1 MB/sec
string_non_null/bloom_filter                       1.00   245.0±10.57ms     2.1 GB/sec    1.02   249.7±12.93ms     2.0 GB/sec
string_non_null/cdc                                1.02   274.6±10.87ms  1908.1 MB/sec    1.00    268.2±7.79ms  1953.8 MB/sec
string_non_null/default                            1.00   130.9±12.16ms     3.9 GB/sec    1.06   139.3±13.82ms     3.7 GB/sec
string_non_null/parquet_2                          1.01    142.2±6.44ms     3.6 GB/sec    1.00    140.6±7.57ms     3.6 GB/sec
string_non_null/zstd                               1.01    541.4±2.11ms   967.8 MB/sec    1.00    538.5±2.59ms   973.1 MB/sec
string_non_null/zstd_parquet_2                     1.02    513.0±5.10ms  1021.4 MB/sec    1.00    501.9±0.41ms  1044.1 MB/sec
struct_all_null/bloom_filter                       1.00     15.9±0.01ms  1015.8 MB/sec    1.00     15.9±0.05ms  1013.7 MB/sec
struct_all_null/cdc                                1.00     22.4±0.17ms   718.9 MB/sec    1.00     22.5±0.09ms   715.6 MB/sec
struct_all_null/default                            1.00     15.6±0.01ms  1034.4 MB/sec    1.00     15.6±0.06ms  1032.2 MB/sec
struct_all_null/parquet_2                          1.00     15.6±0.01ms  1034.6 MB/sec    1.00     15.6±0.05ms  1033.4 MB/sec
struct_all_null/zstd                               1.00     15.6±0.01ms  1030.4 MB/sec    1.00     15.7±0.05ms  1028.8 MB/sec
struct_all_null/zstd_parquet_2                     1.00     15.6±0.01ms  1032.1 MB/sec    1.00     15.7±0.05ms  1030.2 MB/sec
struct_non_null/bloom_filter                       1.00     62.4±0.21ms   256.3 MB/sec    1.01     63.3±1.07ms   252.7 MB/sec
struct_non_null/cdc                                1.00     59.5±0.16ms   268.9 MB/sec    1.00     59.5±0.17ms   268.9 MB/sec
struct_non_null/default                            1.00     46.9±0.09ms   341.5 MB/sec    1.00     47.0±0.15ms   340.8 MB/sec
struct_non_null/parquet_2                          1.00     55.7±0.16ms   287.1 MB/sec    1.00     55.6±0.26ms   287.8 MB/sec
struct_non_null/zstd                               1.00     56.0±0.11ms   285.6 MB/sec    1.00     55.8±0.10ms   286.8 MB/sec
struct_non_null/zstd_parquet_2                     1.00     69.7±0.11ms   229.7 MB/sec    1.00     69.7±0.31ms   229.7 MB/sec
struct_sparse_99pct_null/bloom_filter              1.01     18.9±0.04ms   852.8 MB/sec    1.00     18.8±0.03ms   858.9 MB/sec
struct_sparse_99pct_null/cdc                       1.05     26.5±0.58ms   609.6 MB/sec    1.00     25.2±0.07ms   641.0 MB/sec
struct_sparse_99pct_null/default                   1.00     18.3±0.03ms   882.1 MB/sec    1.00     18.3±0.04ms   883.5 MB/sec
struct_sparse_99pct_null/parquet_2                 1.00     18.3±0.03ms   881.8 MB/sec    1.00     18.2±0.03ms   884.2 MB/sec
struct_sparse_99pct_null/zstd                      1.00     19.6±0.03ms   821.9 MB/sec    1.00     19.6±0.04ms   822.5 MB/sec
struct_sparse_99pct_null/zstd_parquet_2            1.00     19.0±0.04ms   846.8 MB/sec    1.00     19.0±0.04ms   848.6 MB/sec
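
For reading the table: the leading number in each column appears to be the run time relative to the faster of the two sides, so 1.00 marks the faster side. That reading is an assumption based on the numbers shown, not something the bot states. The helper names below are hypothetical, and the inputs are the rounded mean times from the `list_primitive_sparse_99pct_null` rows.

```rust
/// Relative factors for (branch, main): each time divided by the faster one.
fn relative_factors(branch_ms: f64, main_ms: f64) -> (f64, f64) {
    let fastest = branch_ms.min(main_ms);
    (branch_ms / fastest, main_ms / fastest)
}

/// Round to two decimals, matching the precision shown in the table.
fn round2(x: f64) -> f64 {
    (x * 100.0).round() / 100.0
}

/// Factor for `main` given the rounded mean times printed in the table.
fn main_factor(branch_ms: f64, main_ms: f64) -> f64 {
    round2(relative_factors(branch_ms, main_ms).1)
}
```

Under that reading, `main_factor(33.9, 42.2)` reproduces the 1.24 shown for `bloom_filter` and `main_factor(33.5, 41.9)` the 1.25 shown for `default`, meaning the branch finishes those sparse list benchmarks in roughly 80% of the time `main` takes.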

Resource Usage

base (merge-base)

Metric Value
Wall time 1985.4s
Peak memory 6.6 GiB
Avg memory 6.4 GiB
CPU user 1909.1s
CPU sys 72.6s
Peak spill 0 B

branch

Metric Value
Wall time 1970.4s
Peak memory 6.6 GiB
Avg memory 6.4 GiB
CPU user 1917.3s
CPU sys 50.3s
Peak spill 0 B


@alamb
Contributor

alamb commented Apr 22, 2026

🚀 -- thanks @HippoBaro

@alamb alamb merged commit e9cbabd into apache:main Apr 22, 2026
16 checks passed

Labels

parquet (Changes to the parquet crate), performance

3 participants