Skip to content

Core: Set scan planning mode when initializing client#15903

Open
gaborkaszab wants to merge 1 commit into
apache:mainfrom
gaborkaszab:main_scan_planning_mode_setting
Open

Core: Set scan planning mode when initializing client#15903
gaborkaszab wants to merge 1 commit into
apache:mainfrom
gaborkaszab:main_scan_planning_mode_setting

Conversation

@gaborkaszab
Copy link
Copy Markdown
Contributor

Since it's a startup parameter for RESTSessionCatalog, setting client-side scan planning mode is move to the initialize() function. With this we can throw an exception right away when an invalid value is set, not just when we try to load a table.

@gaborkaszab
Copy link
Copy Markdown
Contributor Author

cc @singhpk234
It felt more natural to set the client side part of the scan planning mode in the initialize() of the catalog, since it's a startup flag. We could also give an error when we provide an invalid value right away when creating a catalog. Otherwise, a server-side setting of the same would simply hide the invalid setting.

Also, in the ScanPlanningMode enum it felt an overkill to have the fromString and modeName functions, because the enumerator itself already has the required functionality for this. Let me know if you disagree and I'll revert.

@gaborkaszab gaborkaszab force-pushed the main_scan_planning_mode_setting branch from e8ca037 to a2e8f33 Compare April 6, 2026 22:37
Comment on lines -73 to -91
public String modeName() {
return name().toLowerCase(Locale.ROOT);
}

public static ScanPlanningMode fromString(String mode) {
for (ScanPlanningMode planningMode : values()) {
if (planningMode.modeName().equalsIgnoreCase(mode)) {
return planningMode;
}
}

throw new IllegalArgumentException(
String.format(
"Invalid scan planning mode: %s. Valid values are: %s",
mode,
Arrays.stream(values())
.map(ScanPlanningMode::modeName)
.collect(Collectors.joining(", "))));
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we are changing the exception message by removing this, i would much rather have the possible values defined to prompt the user to correct message.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for spotting this! My initial intention was to rather start a conversation on this, but I forgot to write the comment.

  1. In general I don't think we print the available values when we can't initialize an enum. These values are in the docs, seems an overkill to also return here.
  2. Instead of having a for loop of values, we can just simply use the valueOf of the enum. This simplifies the fromString implementation.

I went back to use the fromString function because it can hide some details, like calling uppercase, and this way we can have a custom error message instead of the one from the enum. I felt printing the available values is an overkill, and also went for the simplification in 2)
Taking a step back, the changes on the enum seems orthogonal to the original intent of the PR, so I can split that into a separate one (and also tackle the same for SnapshotMode).

WDYT @singhpk234

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is somewhat unrelated to the scope of the PR. May I move this into a separate PR and continue the discussion there?

Comment on lines +622 to +629
// Determine effective mode: prefer server config if present, otherwise use client config,
// fall back to default if both are null
ScanPlanningMode effectiveMode =
scanPlanningModeServer != null
? scanPlanningModeServer
: scanPlanningMode != null
? scanPlanningMode
: RESTCatalogProperties.SCAN_PLANNING_MODE_DEFAULT;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we still maintain the 2 if ? it hard to follow a ternary within ternary op

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, changed this to the 2-if approach

@gaborkaszab gaborkaszab force-pushed the main_scan_planning_mode_setting branch 4 times, most recently from ec086d5 to 4e42a5b Compare April 14, 2026 11:20
@gaborkaszab gaborkaszab force-pushed the main_scan_planning_mode_setting branch from 4e42a5b to 3f16e42 Compare April 22, 2026 14:21
@gaborkaszab
Copy link
Copy Markdown
Contributor Author

Hey @singhpk234 ,
I've rebase to get rid of the merge conflict. WDYT about the current version?

@gaborkaszab
Copy link
Copy Markdown
Contributor Author

Gentle reminder @singhpk234 :) Do you think this version of the PR is ok?

Since it's a startup parameter for RESTSessionCatalog, setting
client-side scan planning mode is move to the initialize()
function. With this we can throw an exception right away when
an invalid value is set, not just when we try to load a table.
@gaborkaszab gaborkaszab force-pushed the main_scan_planning_mode_setting branch from 3f16e42 to aec9b03 Compare May 22, 2026 09:06
@gaborkaszab gaborkaszab changed the title Core, Spark: Set scan planning mode when initializing client Core: Set scan planning mode when initializing client May 22, 2026
@gaborkaszab
Copy link
Copy Markdown
Contributor Author

Hey @singhpk234 ,
Just gentle reminder. Would you mind taking a look?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants