[CELEBORN-2321] Avoid locking disk writers during memory split checks#3680
Closed
sunchao wants to merge 1 commit into apache:main from
(cherry picked from commit 1630aab0b59f3b5c7adf47e85062e4d28a5f1662)
Pull request overview
This PR reduces lock contention on the worker push-data hot path by adding an unlocked fast-path in PartitionDataWriter.needHardSplitForMemoryShuffleStorage() for non-memory (disk/DFS) writers, while preserving the original semantics for memory writers via a synchronized re-check.
Changes:
- Remove method-level synchronization from `needHardSplitForMemoryShuffleStorage()` and add a lock-free early return for non-`MemoryTierWriter` cases.
- For `MemoryTierWriter`, re-read `currentTierWriter` after entering `synchronized (this)` to ensure correctness if the tier changes concurrently.
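The check-then-lock pattern described in this overview can be sketched roughly as follows. This is a minimal illustration, not the actual Celeborn code: `TierWriter`, `DiskTierWriter`, `MemoryTierWriter`, and `PartitionWriterSketch` are hypothetical stand-ins, the `bufferedBytes() >= memoryLimitBytes` condition is a placeholder for the real hard-split conditions, and declaring the field `volatile` is an assumption made so the unlocked read is well-defined in this sketch.

```java
// Hypothetical stand-ins for the worker's tier writers; not the real Celeborn classes.
interface TierWriter {}

final class DiskTierWriter implements TierWriter {}

final class MemoryTierWriter implements TierWriter {
    private final long bufferedBytes;

    MemoryTierWriter(long bufferedBytes) {
        this.bufferedBytes = bufferedBytes;
    }

    long bufferedBytes() {
        return bufferedBytes;
    }
}

final class PartitionWriterSketch {
    // Assumption: volatile, so the unlocked read below observes the latest tier.
    private volatile TierWriter currentTierWriter;
    // Placeholder threshold standing in for the real hard-split conditions.
    private final long memoryLimitBytes;

    PartitionWriterSketch(TierWriter initial, long memoryLimitBytes) {
        this.currentTierWriter = initial;
        this.memoryLimitBytes = memoryLimitBytes;
    }

    boolean needHardSplitForMemoryShuffleStorage() {
        // Fast path: non-memory writers return without taking the monitor.
        if (!(currentTierWriter instanceof MemoryTierWriter)) {
            return false;
        }
        synchronized (this) {
            // Re-read under the lock: the tier may have changed since the unlocked check.
            TierWriter writer = currentTierWriter;
            if (!(writer instanceof MemoryTierWriter)) {
                return false;
            }
            return ((MemoryTierWriter) writer).bufferedBytes() >= memoryLimitBytes;
        }
    }

    // Simulates an eviction that swaps the tier while holding the same monitor.
    synchronized void evictTo(TierWriter next) {
        currentTierWriter = next;
    }
}

final class HardSplitSketch {
    public static void main(String[] args) {
        PartitionWriterSketch disk = new PartitionWriterSketch(new DiskTierWriter(), 1024L);
        PartitionWriterSketch mem = new PartitionWriterSketch(new MemoryTierWriter(2048L), 1024L);

        System.out.println(disk.needHardSplitForMemoryShuffleStorage());
        System.out.println(mem.needHardSplitForMemoryShuffleStorage());

        mem.evictTo(new DiskTierWriter());
        System.out.println(mem.needHardSplitForMemoryShuffleStorage());
    }
}
```

The second read inside the synchronized block is what keeps the memory-writer result consistent with the fully synchronized version: a writer that looked memory-backed in the unlocked check but was evicted before the lock was acquired still returns `false`.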
SteNicholas
approved these changes
May 10, 2026
Member
Thanks for the fix. Merged to main (v0.7.0).
Member
Author
Thanks!
Why are the changes needed?
`needHardSplitForMemoryShuffleStorage()` runs on the push path. Disk-backed writers can never require this memory-only split check, but the method currently acquires the writer lock before returning `false`. For the common disk-backed case, that adds avoidable contention with writes and evictions on a hot path.
What changes were proposed in this PR?
This PR adds an unlocked fast path for non-memory writers so they return immediately without taking the `PartitionDataWriter` monitor. For memory-backed writers, it rechecks `currentTierWriter` after entering the synchronized block before evaluating the existing hard-split conditions, which preserves the original behavior if the writer tier changes concurrently.
Does this PR resolve a correctness bug?
No.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
`build/mvn -pl worker -am -DskipTests compile` on current `main`. `worker` because `celeborn-master_2.12` cannot resolve snapshot test-jar artifacts for `celeborn-common_2.12` and `celeborn-service_2.12`; that failure is unrelated to this change.