MDEV-10838: Avoiding computation when window frame is always empty#5167
MDEV-10838: Avoiding computation when window frame is always empty#5167EricHayter wants to merge 1 commit into
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces an optimization for window functions by detecting when window frames are always empty, allowing the execution to bypass sorting and cursor management via a fast-path sequential scan. However, a critical bug was identified in Window_frame::is_frame_always_empty(), where fix_fields_if_needed is called inside a DBUG_ASSERT macro. In release builds, this assertion is compiled out, which would prevent the item from being fixed and likely lead to crashes or undefined behavior during execution.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
gkodinov
left a comment
There was a problem hiding this comment.
Thank you for your contribution! This is a preliminary review.
Please squash all of your commits into a single commit per logical change. This might be a single commit or two commits in your case: depends on how much you want to highlight the two logical optimizations you're doing (detect a single empty frame and detect a situation when all frames are empty).
Then please make sure the commit(s) have a good explanation in their commit message(s), compliant with the MariaDB server coding standards.
Please also consider adding a regression test for this.
One way of doing this that comes to mind is testing some sort of status counters (e.g. on I/O ops) post query that might get bumped or not.
Another way is to dip a DBUG_EXECUTE_IF() that will cause the execution to fail if it gets to the DBUG_EXECUTE_IF and it's on. Then you set the flag and do a test SQL hoping it won't fail (as in: go into the non-optimized code path).
If there's no possibility for a practical regression test, please explain why in the commit message.
A window frame is always-empty when its bounds guarantee it can never contain any rows, for example: ROWS BETWEEN 2 PRECEDING AND 3 PRECEDING Here the start bound (2 PRECEDING) is always further from the current row than the end bound (3 PRECEDING), so the frame is empty for every row regardless of the data. Previously such queries still performed a full filesort before computing NULL for every row. This patch adds a fast path that detects always-empty frames at execution time and skips the filesort entirely, doing a single sequential scan that writes NULL for each window function result. Two new mechanisms are introduced: - Window_frame::is_frame_always_empty(): detects whether a frame's bounds guarantee an empty result for all rows, covering both ROWS and RANGE units. - Window_func_runner::exec_always_empty(): a simplified execution path for the all-empty case. It skips cursor setup and filesort, scanning the table once to write the cleared (NULL) aggregator value per row. The Cursor_manager::always_empty flag avoids cursor I/O overhead for individual always-empty windows in groups that also contain non-empty windows. A debug regression test asserts via DBUG_EXECUTE_IF that the filesort slow path is never reached when all window frames are always-empty.
6d4c032 to
58f8fc7
Compare
This PR adds a simple optimization for window functions of the form
[ROWS | RANGE] BETWEEN N PRECEDING AND M PRECEDING(whereN<M) orBETWEEN N FOLLOWING AND M FOLLOWING(whereN>M). In both cases the frame is statically empty, so the temp file sort in window function execution is skipped entirely andNULLs are written directly to the output for each row.Resolves MDEV-10838