Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 7 additions & 0 deletions acceptance/acceptance_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -277,6 +277,13 @@ func testAccept(t *testing.T, inprocessMode bool, singleTest string) int {
cli293Path := DownloadCLI(t, buildDir, "0.293.0")
t.Setenv("CLI_293", cli293Path)
repls.SetPath(cli293Path, "[CLI_293]")

// v1.0.0 understands state schema versions only up to 2. Used by tests
// asserting that an older CLI rejects newer state (e.g. the v3 state
// written when a bundle opts into the deployment metadata service).
cliV1Path := DownloadCLI(t, buildDir, "1.0.0")
t.Setenv("CLI_V1", cliV1Path)
repls.SetPath(cliV1Path, "[CLI_V1]")
}

paths := []string{
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
bundle:
name: test-rdh-state-upgrade

experimental:
record_deployment_history: true

resources:
jobs:
foo:
name: foo-job-renamed
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
bundle:
name: test-rdh-state-upgrade

resources:
jobs:
foo:
name: foo-job

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Original file line number Diff line number Diff line change
@@ -0,0 +1,52 @@

=== without the flag: state stays at the baseline schema version
>>> [CLI] bundle deploy
Uploading bundle files to /Workspace/Users/[USERNAME]/.bundle/test-rdh-state-upgrade/default/files...
Deploying resources...
Updating deployment state...
Deployment complete!

>>> print_state.py
{
"state_version": 2,
"dms_version": null
}

=== with experimental.record_deployment_history: state upgrades and records the DMS version
>>> [CLI] bundle deploy
Uploading bundle files to /Workspace/Users/[USERNAME]/.bundle/test-rdh-state-upgrade/default/files...
Deploying resources...
Updating deployment state...
Deployment complete!

>>> print_state.py
{
"state_version": 3,
"dms_version": 1
}

=== an older CLI (v1.0.0, max state version 2) rejects operations on the upgraded state
>>> errcode [CLI_V1] bundle plan
Warning: unknown field: record_deployment_history
at experimental
in databricks.yml:5:3

Error: migrating state [TEST_TMP_DIR]/.databricks/bundle/default/resources.json: state version 3 is newer than supported version 2; upgrade the CLI


Exit code: 1

=== with the flag, a state written by a newer record_deployment_history version is rejected
>>> update_file.py .databricks/bundle/default/resources.json "dms_version": 1 "dms_version": 2

>>> errcode [CLI] bundle plan
Error: record_deployment_history state version 2 is newer than supported version 1; upgrade the CLI


Exit code: 1

=== without the flag, the same state is accepted (the check is gated on opt-in)
>>> update_file.py databricks.yml record_deployment_history: true record_deployment_history: false

>>> errcode [CLI] bundle plan
Plan: 0 to add, 0 to change, 0 to delete, 1 unchanged
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
# State-version lifecycle for the DMS preview (experimental.record_deployment_history):
# 1. Without the flag, deploys keep writing the baseline state version.
# 2. Opting in upgrades the state version and records the DMS version.
# 3. An older CLI that predates the DMS state version refuses to operate on the
# upgraded state and tells the user to upgrade, so it cannot mishandle it.
# 4. With the flag set, a state stamped with a newer record_deployment_history
# version than this CLI knows is also refused (the state schema version is
# one we support). Without the flag the check is skipped.

title "without the flag: state stays at the baseline schema version"
trace $CLI bundle deploy
trace print_state.py | jq '{state_version, dms_version}'

title "with experimental.record_deployment_history: state upgrades and records the DMS version"
# Also renames the job so the deploy writes state (a no-op deploy would not
# rewrite the state file and the upgrade would not be observable yet).
cp databricks.dms.yml databricks.yml
trace $CLI bundle deploy
trace print_state.py | jq '{state_version, dms_version}'

title "an older CLI (v1.0.0, max state version 2) rejects operations on the upgraded state"
trace errcode $CLI_V1 bundle plan 2>&1 | contains.py "state version 3 is newer than supported version 2; upgrade the CLI"

title "with the flag, a state written by a newer record_deployment_history version is rejected"
# The state must hold a record_deployment_history version greater than this CLI's
# supported version (dmsVersion) to be rejected, so we set supported+1.
# TODO: when dmsVersion is bumped, change the value written below to the new
# supported+1. The error's version numbers are not asserted here, so the assertion
# needs no change.
trace update_file.py .databricks/bundle/default/resources.json '"dms_version": 1' '"dms_version": 2'
trace errcode $CLI bundle plan 2>&1 | contains.py "record_deployment_history state version" "is newer than supported version" "upgrade the CLI"

title "without the flag, the same state is accepted (the check is gated on opt-in)"
trace update_file.py databricks.yml "record_deployment_history: true" "record_deployment_history: false"
trace errcode $CLI bundle plan 2>&1 | contains.py "!is newer than supported version"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if experimental flag is removed? Do we stop calling DMS but keep using normal direct engine? Do we keep version 3 or 2 in that case?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we stop calling DMS but keep using normal direct engine?

Yes, we fall back to normal direct engine. It works fine because we'll continue to maintain direct state with WAL and resources.json as we preview.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we keep version 3 or 2 in that case?

We keep version 3. The configuration remains the source of truth for whether to use direct or DMS (provide easy fallback for customers).

Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
Local = true
Cloud = false

# databricks.yml is rewritten in-place by the script (flag added in the second step).
Ignore = [".databricks", "databricks.yml"]

# The state-version upgrade only applies to the direct engine; terraform stores
# state differently and would diverge.
[EnvMatrix]
DATABRICKS_BUNDLE_ENGINE = ["direct"]
Comment thread
shreyas-goenka marked this conversation as resolved.
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
bundle:
name: test-rdh-terraform-unsupported

experimental:
record_deployment_history: true

resources:
jobs:
foo:
name: foo-job

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@

>>> errcode [CLI] bundle plan
Error: experimental.record_deployment_history is only supported with the direct deployment engine


Exit code: 1
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
# record_deployment_history drives DMS, which only the direct engine supports.
# Under the terraform engine the flag must be rejected, not silently ignored.
trace errcode $CLI bundle plan 2>&1 | contains.py "experimental.record_deployment_history is only supported with the direct deployment engine"
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
Local = true
Cloud = false
RecordRequests = false

Ignore = [".databricks"]

[EnvMatrix]
DATABRICKS_BUNDLE_ENGINE = ["terraform"]
2 changes: 1 addition & 1 deletion acceptance/bundle/state/future_version/output.txt
Original file line number Diff line number Diff line change
@@ -1,3 +1,3 @@
state version 999 is newer than supported version 2; upgrade the CLI
state version 999 is newer than supported version 3; upgrade the CLI

Exit code: 1
2 changes: 1 addition & 1 deletion bundle/configsync/diff.go
Original file line number Diff line number Diff line change
Expand Up @@ -140,7 +140,7 @@ func DetectChanges(ctx context.Context, b *bundle.Bundle, engine engine.EngineTy
} else {
deployBundle = &direct.DeploymentBundle{}
_, statePath := b.StateFilenameConfigSnapshot(ctx)
if err := deployBundle.StateDB.Open(ctx, statePath, dstate.WithRecovery(true), dstate.WithWrite(false)); err != nil {
if err := deployBundle.StateDB.Open(ctx, statePath, dstate.WithRecovery(true), dstate.WithWrite(false), dstate.WithDMS(false)); err != nil {
return nil, fmt.Errorf("failed to open state: %w", err)
}
}
Expand Down
2 changes: 1 addition & 1 deletion bundle/configsync/variables.go
Original file line number Diff line number Diff line change
Expand Up @@ -144,7 +144,7 @@ func resourceIDLookup(ctx context.Context, b *bundle.Bundle) func(string) string
}
_, statePath := b.StateFilenameConfigSnapshot(ctx)
db := &dstate.DeploymentState{}
if err := db.Open(ctx, statePath, dstate.WithRecovery(false), dstate.WithWrite(false)); err != nil {
if err := db.Open(ctx, statePath, dstate.WithRecovery(false), dstate.WithWrite(false), dstate.WithDMS(false)); err != nil {
log.Debugf(ctx, "variable restoration: failed to open state DB at %s: %v", statePath, err)
return nil
}
Expand Down
12 changes: 6 additions & 6 deletions bundle/direct/bind.go
Original file line number Diff line number Diff line change
Expand Up @@ -62,7 +62,7 @@ type BindResult struct {
func (b *DeploymentBundle) Bind(ctx context.Context, client *databricks.WorkspaceClient, configRoot *config.Root, statePath, resourceKey, resourceID string) (*BindResult, error) {
// Check if the resource is already managed (bound to a different ID)
var checkStateDB dstate.DeploymentState
if err := checkStateDB.Open(ctx, statePath, dstate.WithRecovery(true), dstate.WithWrite(false)); err == nil {
if err := checkStateDB.Open(ctx, statePath, dstate.WithRecovery(true), dstate.WithWrite(false), dstate.WithDMS(false)); err == nil {
existingID := checkStateDB.GetResourceID(resourceKey)
if _, err := checkStateDB.Finalize(ctx); err != nil {
log.Warnf(ctx, "failed to finalize state: %v", err)
Expand All @@ -86,7 +86,7 @@ func (b *DeploymentBundle) Bind(ctx context.Context, client *databricks.Workspac
}

// Open temp state
err := b.StateDB.Open(ctx, tmpStatePath, dstate.WithRecovery(false), dstate.WithWrite(true))
err := b.StateDB.Open(ctx, tmpStatePath, dstate.WithRecovery(false), dstate.WithWrite(true), dstate.WithDMS(false))
if err != nil {
os.Remove(tmpStatePath)
return nil, err
Expand All @@ -109,7 +109,7 @@ func (b *DeploymentBundle) Bind(ctx context.Context, client *databricks.Workspac
log.Infof(ctx, "Bound %s to id=%s (in temp state)", resourceKey, resourceID)

// First plan + update: populate state with resolved config
err = b.StateDB.Open(ctx, tmpStatePath, dstate.WithRecovery(true), dstate.WithWrite(false))
err = b.StateDB.Open(ctx, tmpStatePath, dstate.WithRecovery(true), dstate.WithWrite(false), dstate.WithDMS(false))
if err != nil {
os.Remove(tmpStatePath)
return nil, err
Expand Down Expand Up @@ -144,7 +144,7 @@ func (b *DeploymentBundle) Bind(ctx context.Context, client *databricks.Workspac
}
}

err = b.StateDB.Open(ctx, tmpStatePath, dstate.WithRecovery(true), dstate.WithWrite(true))
err = b.StateDB.Open(ctx, tmpStatePath, dstate.WithRecovery(true), dstate.WithWrite(true), dstate.WithDMS(false))
if err != nil {
os.Remove(tmpStatePath)
return nil, err
Expand All @@ -164,7 +164,7 @@ func (b *DeploymentBundle) Bind(ctx context.Context, client *databricks.Workspac
}

// Second plan: this is the plan to present to the user (change between remote resource and config)
err = b.StateDB.Open(ctx, tmpStatePath, dstate.WithRecovery(true), dstate.WithWrite(false))
err = b.StateDB.Open(ctx, tmpStatePath, dstate.WithRecovery(true), dstate.WithWrite(false), dstate.WithDMS(false))
if err != nil {
os.Remove(tmpStatePath)
return nil, err
Expand Down Expand Up @@ -214,7 +214,7 @@ func (result *BindResult) Cancel() {
// Unbind removes a resource from direct engine state without deleting
// the workspace resource. Also removes associated permissions/grants entries.
func (b *DeploymentBundle) Unbind(ctx context.Context, statePath, resourceKey string) error {
err := b.StateDB.Open(ctx, statePath, dstate.WithRecovery(true), dstate.WithWrite(true))
err := b.StateDB.Open(ctx, statePath, dstate.WithRecovery(true), dstate.WithWrite(true), dstate.WithDMS(false))
if err != nil {
return err
}
Expand Down
21 changes: 15 additions & 6 deletions bundle/direct/dstate/migrate.go
Original file line number Diff line number Diff line change
Expand Up @@ -10,16 +10,25 @@ import (
"github.com/databricks/databricks-sdk-go/service/iam"
)

// migrateState runs all necessary migrations on the database.
// migrateState brings a freshly-loaded state up to a version this CLI can use.
// It is called after loading state from disk.
//
// Two versions are "current" and left untouched: the baseline currentStateVersion
// and the opt-in dmsStateVersion (written when a bundle previews DMS). Legacy
// states below the baseline are migrated forward via the migrations map. A state
// newer than dmsStateVersion was written by a newer CLI, so we refuse it rather
// than risk mishandling a format we don't understand.
//
// The DMS protocol version (dms_version) is enforced separately and only when
// the bundle has opted into DMS; see Open's WithDMS option.
func migrateState(db *Database) error {
if db.StateVersion == currentStateVersion {
return nil
}
if db.StateVersion > currentStateVersion {
return fmt.Errorf("state version %d is newer than supported version %d; upgrade the CLI", db.StateVersion, currentStateVersion)
if db.StateVersion > dmsStateVersion {
return fmt.Errorf("state version %d is newer than supported version %d; upgrade the CLI", db.StateVersion, dmsStateVersion)
}

// Only legacy states (below the baseline) migrate here. The DMS upgrade is an
// explicit deploy-time step (see UpgradeToDMS), never an automatic migration,
// so a dmsStateVersion state falls through this loop unchanged.
for version := db.StateVersion; version < currentStateVersion; version++ {
fn, ok := migrations[version]
if !ok {
Expand Down
95 changes: 95 additions & 0 deletions bundle/direct/dstate/migrate_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,95 @@
package dstate

import (
"testing"

"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)

func TestMigrateStateLeavesCurrentUntouched(t *testing.T) {
db := &Database{Header: Header{StateVersion: currentStateVersion}}
require.NoError(t, migrateState(db))
assert.Equal(t, currentStateVersion, db.StateVersion)
}

func TestMigrateStateLeavesDMSStateUntouched(t *testing.T) {
// A DMS-upgraded state is already current; it must not be downgraded to the
// baseline version, and the recorded DMS version must be preserved.
db := &Database{Header: Header{StateVersion: dmsStateVersion, DmsVersion: dmsVersion}}
require.NoError(t, migrateState(db))
assert.Equal(t, dmsStateVersion, db.StateVersion)
assert.Equal(t, dmsVersion, db.DmsVersion)
}

func TestMigrateStateRejectsNewerThanSupported(t *testing.T) {
db := &Database{Header: Header{StateVersion: dmsStateVersion + 1}}
err := migrateState(db)
require.Error(t, err)
assert.Contains(t, err.Error(), "upgrade the CLI")
}

// TestStateSchemaVersions pins the state schema version constants. They are part
// of the on-disk format and a contract with older CLIs, so changing them must be
// deliberate. If you are intentionally bumping the baseline schema, this test is
// your checklist: follow the steps below, then update the assertions last.
//
// HOW TO BUMP THE BASELINE STATE VERSION (2 -> 4) AND RETIRE THE DMS SPECIAL-CASING
//
// Version 3 only exists because previewing DMS bumped the schema out-of-band via
// UpgradeToDMS, so non-preview bundles weren't forced to upgrade. The next baseline
// bump is when you delete all of that custom code and make 3 an ordinary version
// in the linear migration chain. Go to 4, not 3 (3 is already in the wild):
//
// 1. state.go: set currentStateVersion = 4. Delete the dmsStateVersion constant
// and the UpgradeToDMS method. Version 3 is now reached only by migration,
// like any other version. (Leave dmsVersion and Header.DmsVersion
// alone; they are a separate concern, see step 4.)
//
// 2. migrate.go: change the upper-bound guard from "> dmsStateVersion" back to
// "> currentStateVersion" (there is again only one current version). Add
// stepwise migrations so every old state climbs to 4:
// migrations[2] = v2 (non-DMS baseline) -> v3
// migrations[3] = v3 (former DMS preview) -> v4
// A v2 state climbs 2->3->4 and a v3 state climbs 3->4, exactly like any
// other version. Write a real transform for whatever the v4 change is.
//
// 3. deploy.go: delete the `if RecordDeploymentHistory { UpgradeToDMS() }` block.
// The version is no longer bumped conditionally; every deploy writes the
// baseline through the normal path.
//
// 4. dms_version is NOT part of this bump. It tracks the DMS *protocol* version
// (dmsVersion), independent of the state schema version, and is enforced
// by Open's WithDMS option (passed by cmd/bundle/utils/process.go only when the
// bundle has opted into DMS). Leave the field, the constant, and that check in
// place; if UpgradeToDMS was the only place stamping it, move that stamping into
// the normal write path.
//
// 5. Tests: update the assertions below (the dmsStateVersion assertion and the
// DMS-schema cases go away); add 2->3->4 and 3->4 migration tests.
// TestMigrationsCoverBaseline fails until migrations[2] and migrations[3] exist.
//
// RELATED COVERAGE
// - acceptance/bundle/state/permission_level_migration is a golden migration
// fixture: it commits a real v1 state and asserts the migrated v2 output,
// catching migration-correctness bugs (not just "did you mean to change this").
// - acceptance/bundle/deploy/record-deployment-history/state-upgrade drives the
// full lifecycle, including both rejections (older CLI vs newer state, and a
// newer DMS version vs this CLI).
// - bundle/invariant/continue_293 asserts the current CLI reads state written by
// an older released CLI, so we never break reading older state.
func TestStateSchemaVersions(t *testing.T) {
assert.Equal(t, 2, currentStateVersion)
assert.Equal(t, 3, dmsStateVersion)
assert.Equal(t, 1, dmsVersion)
}

// TestMigrationsCoverBaseline guards a baseline bump: every state version below
// currentStateVersion must have a migration to the next version, so migrateState
// can always climb a legacy state up to the baseline. A bump that forgets a
// migration fails here instead of at a user's deploy.
func TestMigrationsCoverBaseline(t *testing.T) {
for v := range currentStateVersion {
assert.Containsf(t, migrations, v, "missing migration for state version %d", v)
}
}
Loading
Loading