
Spark: Fix DELETE_GRANULARITY_DEFAULT and use it in SparkWriteConf #16520

Open
turboFei wants to merge 1 commit into apache:main from turboFei:default_value

Conversation

@turboFei
Member

@turboFei turboFei commented May 21, 2026

Background

#11478 changed the effective default delete granularity from PARTITION to FILE for Spark writes, but introduced two inconsistencies:

  1. TableProperties.DELETE_GRANULARITY_DEFAULT was left pointing to DeleteGranularity.PARTITION, making the constant misleading and effectively dead code.
  2. All three Spark versions (v3.5, v4.0, v4.1) hardcoded DeleteGranularity.FILE directly in SparkWriteConf instead of referencing the constant.

Changes

  • Change TableProperties.DELETE_GRANULARITY_DEFAULT to DeleteGranularity.FILE so the constant reflects the intended default
  • Use TableProperties.DELETE_GRANULARITY_DEFAULT in SparkWriteConf for Spark v3.5, v4.0, and v4.1 to establish a single source of truth
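
A minimal, self-contained sketch of the "single source of truth" idea described above. This is not the actual Iceberg code: the class `DeleteGranularityDemo`, the nested enum, the `resolve` helper, and the option/property key strings are simplified stand-ins assumed for illustration. It only shows the default being read from one constant instead of a hardcoded enum value, with an assumed precedence of write option, then table property, then default.

```java
import java.util.Locale;
import java.util.Map;

public class DeleteGranularityDemo {

  // Hypothetical stand-in for Iceberg's DeleteGranularity enum.
  public enum DeleteGranularity {
    FILE,
    PARTITION;

    public static DeleteGranularity fromString(String value) {
      return valueOf(value.toUpperCase(Locale.ROOT));
    }
  }

  // Assumed key names, used here only for illustration.
  public static final String WRITE_OPTION = "delete-granularity";
  public static final String DELETE_GRANULARITY = "write.delete.granularity";

  // The single constant that defines the default (FILE after this change).
  public static final String DELETE_GRANULARITY_DEFAULT = DeleteGranularity.FILE.toString();

  // Resolve the effective granularity: write option first, then table
  // property, then the shared default constant (never a hardcoded enum value).
  public static DeleteGranularity resolve(
      Map<String, String> writeOptions, Map<String, String> tableProperties) {
    String value = writeOptions.get(WRITE_OPTION);
    if (value == null) {
      value = tableProperties.get(DELETE_GRANULARITY);
    }
    if (value == null) {
      value = DELETE_GRANULARITY_DEFAULT;
    }
    return DeleteGranularity.fromString(value);
  }

  public static void main(String[] args) {
    // Nothing configured: falls back to the constant.
    System.out.println(resolve(Map.of(), Map.of()));
    // Table property set explicitly: overrides the default.
    System.out.println(resolve(Map.of(), Map.of(DELETE_GRANULARITY, "partition")));
    // Write option set explicitly: overrides the table property.
    System.out.println(
        resolve(Map.of(WRITE_OPTION, "file"), Map.of(DELETE_GRANULARITY, "partition")));
  }
}
```

With this shape, changing the default means editing one constant, and every caller that falls through to the default picks up the new value.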

Test Plan

  • Existing unit tests for SparkWriteConf cover the default value resolution path
  • No behavior change — the effective default (FILE) remains the same; this only removes the inconsistency between the constant and its usage

Followup to #11478

…teConf

apache#11478 changed the effective default delete granularity to FILE but left
TableProperties.DELETE_GRANULARITY_DEFAULT pointing to PARTITION, and
hardcoded DeleteGranularity.FILE directly in SparkWriteConf instead of
using the constant.

This fixes both issues:
- Update DELETE_GRANULARITY_DEFAULT to DeleteGranularity.FILE to match
  the intended default introduced in apache#11478
- Use TableProperties.DELETE_GRANULARITY_DEFAULT in SparkWriteConf for
  Spark v3.5, v4.0, and v4.1 so there is a single source of truth
@manuzhang
Member

@turboFei

#11478 changed Spark's default configuration from partition to file, but the table's default is still partition.

You can see the difference in the docs.

@manuzhang
Member

@szehon-ho has a PR to keep Spark-specific table properties in the Spark module. It would be good to add DELETE_GRANULARITY_DEFAULT to SparkTableProperties once that PR is merged.
