Optimize parquet row filter auto strategy with adaptive fallback #9956
Draft
hhhizzz wants to merge 6 commits into apache:main from
Which issue does this PR close?
Rationale for this change
`RowFilter` can be much slower than a full scan for fragmented selections, especially when page indexes provide little pruning and the resulting `ReadPlan` contains many tiny select/skip runs. #8565 shows an extreme case where predicate pushdown is around 10x slower than scanning and filtering afterwards.

This PR improves the `RowSelectionPolicy::Auto` path so it can make better strategy decisions for fragmented row selections and avoid continuing with row-filter pushdown when the observed shape suggests the pushdown path is unlikely to help.

The main goal is to reduce the performance cliff without changing explicit `Mask`/`Selectors` behavior.

While working on the Auto strategy, this PR also fixes a correctness issue in the `Mask` execution path. With sparse page-loaded ranges, a mask-backed read plan could previously attempt to consume selected rows outside the loaded ranges and fail during decoding. This made `Mask` risky for some page-index / fragmented-selection cases.

This PR makes the loaded row ranges explicit in the read plan and adds coverage for sparse loaded-range execution. After this change, using `Mask` should no longer hit this known failure mode.

Design / implementation notes
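To make the fragmentation concern concrete, here is a minimal sketch of how the shape of a selection could be summarized. The `Run` and `SelectionShape` types and their fields are hypothetical stand-ins written for this illustration; they are not the parquet crate's `RowSelector`/`RowSelection` types, nor the metrics this PR actually records.

```rust
/// Hypothetical, simplified stand-in for one run of a row selection.
/// (Assumption for illustration; not the parquet crate's `RowSelector`.)
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Run {
    pub row_count: usize,
    pub is_skip: bool,
}

impl Run {
    pub fn select(row_count: usize) -> Self {
        Run { row_count, is_skip: false }
    }

    pub fn skip(row_count: usize) -> Self {
        Run { row_count, is_skip: true }
    }
}

/// Illustrative shape summary; the field names are assumptions.
#[derive(Debug, PartialEq)]
pub struct SelectionShape {
    pub total_rows: usize,
    pub selected_rows: usize,
    pub run_count: usize,
}

impl SelectionShape {
    /// Summarize a sequence of runs, ignoring empty ones.
    pub fn from_runs(runs: &[Run]) -> Self {
        let mut shape = SelectionShape { total_rows: 0, selected_rows: 0, run_count: 0 };
        for run in runs.iter().filter(|r| r.row_count > 0) {
            shape.total_rows += run.row_count;
            if !run.is_skip {
                shape.selected_rows += run.row_count;
            }
            shape.run_count += 1;
        }
        shape
    }

    /// Fraction of rows that are selected (0.0 for an empty selection).
    pub fn selectivity(&self) -> f64 {
        if self.total_rows == 0 {
            0.0
        } else {
            self.selected_rows as f64 / self.total_rows as f64
        }
    }

    /// Average rows per run; small values indicate a fragmented selection.
    pub fn avg_run_len(&self) -> f64 {
        if self.run_count == 0 {
            0.0
        } else {
            self.total_rows as f64 / self.run_count as f64
        }
    }
}
```

With these definitions, 500 alternating one-row select/skip pairs and a single 500-row select followed by a 500-row skip both have selectivity 0.5, but their average run lengths are 1 and 500 respectively. The first shape is the one where selector-backed execution spends its time on tiny runs.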
This PR started from the observation that row-level predicate pushdown can hit a performance cliff when the resulting `RowSelection` is highly fragmented. In that shape, selector-backed execution spends a lot of time repeatedly skipping and reading tiny row runs, so it can be slower than a full decode followed by filtering.

A simple "prefer Mask more often" rule is not sufficient, though. When page-level pruning is involved, the in-memory column chunks may be sparse: some pages were never loaded because the current selection skipped them. The old code avoided this by forcing selectors in cases where a mask could try to decode rows from pages that were not present. This PR makes that distinction explicit:

- `Auto` remains conservative when page pruning has produced sparse loaded ranges.
- `RowSelectionPolicy::Mask` no longer assumes the loaded column data is dense. It now tracks loaded row ranges and uses a sparse mask cursor, so future users of Mask do not hit missing-page / invalid-offset failures.

The other part of the change is runtime fallback. Static selection heuristics only see the final `RowSelection` shape; they do not know whether predicate pushdown is actually saving work for this file/query. The push decoder now observes early row-group selection shape and can switch later row groups to decode once and apply the predicate after decode when pushdown is unlikely to pay off, for example high-selectivity/no-pruning cases or fragmented moderate/high-selectivity cases.

What changes are included in this PR?
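As a rough illustration of the runtime fallback described in the design notes, the sketch below accumulates selection shape over the first few row groups and then picks a mode for later ones. Every name and threshold here (`AdaptiveFallback`, `warmup_row_groups`, the 0.9 / 0.3 / 8.0 cutoffs) is an assumption made for the example, not the PR's actual code or tuning.

```rust
/// Where the predicate is evaluated for a row group (illustrative names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FilterMode {
    /// Evaluate the predicate first and decode only the selected rows.
    Pushdown,
    /// Decode the row group once, then apply the predicate to the decoded data.
    PostDecode,
}

/// Selection shape observed after filtering one row group (hypothetical struct).
#[derive(Debug, Clone, Copy)]
pub struct Observation {
    pub total_rows: usize,
    pub selected_rows: usize,
    pub run_count: usize,
}

/// Accumulates observations and decides the mode for later row groups.
#[derive(Debug)]
pub struct AdaptiveFallback {
    warmup_row_groups: usize,
    observed_row_groups: usize,
    total_rows: usize,
    selected_rows: usize,
    run_count: usize,
}

impl AdaptiveFallback {
    pub fn new(warmup_row_groups: usize) -> Self {
        AdaptiveFallback {
            warmup_row_groups,
            observed_row_groups: 0,
            total_rows: 0,
            selected_rows: 0,
            run_count: 0,
        }
    }

    /// Record the selection shape produced by one completed row group.
    pub fn observe(&mut self, obs: Observation) {
        self.observed_row_groups += 1;
        self.total_rows += obs.total_rows;
        self.selected_rows += obs.selected_rows;
        self.run_count += obs.run_count;
    }

    /// Mode to use for the next row group.
    pub fn mode(&self) -> FilterMode {
        // Stay on pushdown until enough row groups have been observed.
        if self.observed_row_groups < self.warmup_row_groups
            || self.total_rows == 0
            || self.run_count == 0
        {
            return FilterMode::Pushdown;
        }
        let selectivity = self.selected_rows as f64 / self.total_rows as f64;
        let avg_run_len = self.total_rows as f64 / self.run_count as f64;
        // Assumed cutoffs, chosen only to make the example concrete.
        let high_selectivity = selectivity >= 0.9;
        let fragmented = avg_run_len < 8.0 && selectivity >= 0.3;
        if high_selectivity || fragmented {
            FilterMode::PostDecode
        } else {
            FilterMode::Pushdown
        }
    }
}
```

In this model a decoder would call `observe` after each row group and consult `mode` before planning the next one, so a file whose early row groups show little pruning benefit stops paying the pushdown cost for the remaining row groups.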
- [...] `RowSelection` shape analysis and strategy decision metrics.
- [...] `RowSelectionPolicy::Auto` so it can choose between mask and selectors using selection shape and loaded page ranges.
- [...] `try_next_reader` handoff paths;
- [...] `ArrowPredicate` twice for the same current row group when caller-provided row selection is present.
- [...] `arrow_reader_row_filter` benchmarks to cover strategy-sensitive cases.
- [...] `Mask` correctness failure with sparse loaded page ranges.
- [...] `Mask` behavior while ensuring sparse loaded ranges are tracked so mask execution does not read outside available page ranges.

Are these changes tested?
Yes.
Unit / integration validation:
- `cargo fmt -p parquet -- --check`
- `git diff --check`
- `cargo test -p parquet --lib arrow::push_decoder`
- `cargo test -p parquet --lib arrow::arrow_reader::read_plan`
- `cargo test -p parquet --lib arrow::arrow_reader::selection`
- `cargo test -p parquet --lib`

New focused tests cover:

- `try_next_reader` does not use post-filter fallback in normal reader handoff mode.
- `Mask` execution remains correct with sparse loaded page ranges.
- [...] `Mask` when sparse loaded ranges make selector execution safer.

Benchmark evidence from `arrow_reader_row_filter` comparing `origin/main` vs this branch:

- `utf8View != ''`, all columns
- `int64 > 90`, all columns
- `int64 > 90`, exclude filter column
- `utf8View != ''`, exclude filter column
- `float64 > 99.0`, exclude filter column

Across the 16 async `arrow_reader_row_filter` cases run, geometric mean speedup was about 1.08x. The worst observed regression in that run was about -3.3%.

Are there any user-facing changes?
No intended breaking API changes.
`RowSelectionPolicy::Auto` may choose different internal execution strategies than before. Explicit `Mask` and `Selectors` policies remain available for callers that want fixed behavior.
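
For callers who pin `Mask`, the sparse loaded-range handling described above can be pictured with a small sketch. A boolean mask is defined over all rows of a row group, but decoded data exists only for the loaded ranges, so each selected row must be translated to an offset within the loaded rows, or rejected if its page was never loaded. The function name, signature, and error convention below are hypothetical and are not the PR's sparse mask cursor.

```rust
use std::ops::Range;

/// Map each selected row to its offset within the concatenated loaded rows.
///
/// `mask[i]` says whether row `i` of the row group is selected.
/// `loaded` lists the row ranges whose pages are in memory; this sketch
/// assumes they are sorted and non-overlapping.
///
/// Returns `Err(row)` for the first selected row that falls outside every
/// loaded range (a hypothetical error convention for this example).
pub fn selected_offsets(mask: &[bool], loaded: &[Range<usize>]) -> Result<Vec<usize>, usize> {
    let mut offsets = Vec::new();
    let mut range_idx = 0;
    // Number of loaded rows in the ranges before `loaded[range_idx]`.
    let mut rows_before = 0;

    for (row, &selected) in mask.iter().enumerate() {
        // Advance past loaded ranges that end at or before this row.
        while range_idx < loaded.len() && loaded[range_idx].end <= row {
            rows_before += loaded[range_idx].len();
            range_idx += 1;
        }
        if !selected {
            continue;
        }
        match loaded.get(range_idx) {
            // The row lies inside the current loaded range.
            Some(r) if r.start <= row => offsets.push(rows_before + (row - r.start)),
            // The row lies in a gap between loaded ranges, or after the last one.
            _ => return Err(row),
        }
    }
    Ok(offsets)
}
```

For a 10-row row group with loaded ranges `0..3` and `6..10`, selecting rows 1, 2 and 7 maps to offsets 1, 2 and 4 in the loaded data, while selecting row 4 is reported as an error rather than read from a page that is not present.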