Spark: Fix DELETE_GRANULARITY_DEFAULT and use it in SparkWriteConf #16520
Open
turboFei wants to merge 1 commit into
…teConf
apache#11478 changed the effective default delete granularity to FILE but left TableProperties.DELETE_GRANULARITY_DEFAULT pointing to PARTITION, and hardcoded DeleteGranularity.FILE directly in SparkWriteConf instead of using the constant.
This fixes both issues:
- Update DELETE_GRANULARITY_DEFAULT to DeleteGranularity.FILE to match the intended default introduced in apache#11478
- Use TableProperties.DELETE_GRANULARITY_DEFAULT in SparkWriteConf for Spark v3.5, v4.0, and v4.1 so there is a single source of truth
Member
#11478 changed Spark's default configuration from You can see the difference from the docs.
Member
@szehon-ho has a PR to keep Spark specific table properties in the Spark module. It will be good to add
Background
#11478 changed the effective default delete granularity from PARTITION to FILE for Spark writes, but introduced two inconsistencies:
- TableProperties.DELETE_GRANULARITY_DEFAULT was left pointing to DeleteGranularity.PARTITION, making the constant misleading and effectively dead code.
- … (v3.5, v4.0, v4.1) hardcoded DeleteGranularity.FILE directly in SparkWriteConf instead of referencing the constant.
Changes
- Update TableProperties.DELETE_GRANULARITY_DEFAULT to DeleteGranularity.FILE to correctly reflect the intended default
- Use TableProperties.DELETE_GRANULARITY_DEFAULT in SparkWriteConf for Spark v3.5, v4.0, and v4.1 to establish a single source of truth
Test Plan
- … SparkWriteConf cover the default value resolution path
- … (FILE) remains the same; this only removes the inconsistency between the constant and its usage
Followup to #11478
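The single-source-of-truth idea described in this PR can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual Iceberg code: the class DeleteGranularityDefaultSketch, the nested Props holder, and the deleteGranularity helper are hypothetical stand-ins, and the property key string is an assumption. The point is only that the resolution path falls back to the shared constant instead of a hardcoded enum value, so changing the constant changes the default everywhere.

```java
import java.util.Locale;
import java.util.Map;

public class DeleteGranularityDefaultSketch {

  // Hypothetical stand-in for the real enum; only the two values named in this PR.
  public enum DeleteGranularity {
    FILE,
    PARTITION;

    public static DeleteGranularity fromString(String value) {
      return valueOf(value.toUpperCase(Locale.ROOT));
    }

    @Override
    public String toString() {
      return name().toLowerCase(Locale.ROOT);
    }
  }

  // Hypothetical stand-in for a shared constants class.
  public static final class Props {
    // Assumed property key, used here for illustration only.
    public static final String DELETE_GRANULARITY = "write.delete.granularity";
    // The single place where the default is defined.
    public static final String DELETE_GRANULARITY_DEFAULT = DeleteGranularity.FILE.toString();

    private Props() {}
  }

  // Resolves the granularity from table properties, falling back to the shared
  // constant rather than a hardcoded DeleteGranularity.FILE.
  public static DeleteGranularity deleteGranularity(Map<String, String> tableProperties) {
    String value =
        tableProperties.getOrDefault(Props.DELETE_GRANULARITY, Props.DELETE_GRANULARITY_DEFAULT);
    return DeleteGranularity.fromString(value);
  }

  public static void main(String[] args) {
    // No explicit property: the shared default applies.
    System.out.println(deleteGranularity(Map.of())); // prints "file"
    // An explicit table property overrides the default.
    System.out.println(
        deleteGranularity(Map.of(Props.DELETE_GRANULARITY, "partition"))); // prints "partition"
  }
}
```

With this shape, a test that asserts the resolved default against the constant (rather than against a literal) would keep passing if the default ever changes again, while a test that asserts against FILE directly would flag the change.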