Skip to content

[WIP] withdraw: persist tx before state transitions#1105

Draft
starius wants to merge 1 commit intolightninglabs:masterfrom
starius:atomic-persist
Draft

[WIP] withdraw: persist tx before state transitions#1105
starius wants to merge 1 commit intolightninglabs:masterfrom
starius:atomic-persist

Conversation

@starius
Copy link
Collaborator

@starius starius commented Mar 25, 2026

Persist finalized withdrawal tx pointers to all selected deposits before triggering state transitions. This avoids restart races where deposits can remain in WITHDRAWING without a recoverable finalized tx reference.

Recovery now groups withdrawing deposits by tx hash and restores missing tx pointers when there is exactly one in-flight withdrawal cluster. Recovered pointers are written back so subsequent restarts remain stable.

Add a regression test which exercises recoverWithdrawals end-to-end and verifies restart recovery for WITHDRAWING deposits missing FinalizedWithdrawalTx.

Pull Request Checklist

  • Update release_notes.md if your PR contains major features, breaking changes or bugfixes

Persist finalized withdrawal tx pointers to all selected deposits before
triggering state transitions. This avoids restart races where deposits can
remain in WITHDRAWING without a recoverable finalized tx reference.

Recovery now groups withdrawing deposits by tx hash and restores missing tx
pointers when there is exactly one in-flight withdrawal cluster. Recovered
pointers are written back so subsequent restarts remain stable.

Add a regression test which exercises recoverWithdrawals end-to-end and verifies
restart recovery for WITHDRAWING deposits missing FinalizedWithdrawalTx.
@gemini-code-assist
Copy link

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the robustness of the withdrawal process by addressing potential data inconsistencies during system restarts. It ensures that critical transaction information is durably stored before any state changes occur, thereby preventing deposits from becoming unrecoverable. The improved recovery logic can now intelligently reconstruct withdrawal clusters, making the system more resilient to unexpected interruptions.

Highlights

  • Robust Withdrawal Recovery: Implemented a mechanism to persist finalized withdrawal transaction pointers to all selected deposits before triggering state transitions, preventing restart races where deposits could get stuck in a WITHDRAWING state without a recoverable transaction reference.
  • Enhanced Recovery Logic: Improved the recovery process to group withdrawing deposits by transaction hash and restore missing transaction pointers when there is exactly one in-flight withdrawal cluster, ensuring subsequent restarts remain stable.
  • New Regression Test: Added a comprehensive regression test that exercises the recoverWithdrawals function end-to-end, specifically verifying the restart recovery for WITHDRAWING deposits that initially lack a FinalizedWithdrawalTx.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the withdrawal manager to improve recovery logic for deposits that are in the WITHDRAWING state but might be missing their FinalizedWithdrawalTx pointers after a restart. This includes extracting deposit grouping and missing transaction recovery logic into a new helper function, groupWithdrawingDepositsByTx, and centralizing the persistence of finalized withdrawal transactions into persistFinalizedWithdrawalTx. A new test case was added to validate this recovery mechanism. The review comments suggest optimizing the iteration logic within both groupWithdrawingDepositsByTx and persistFinalizedWithdrawalTx to enhance efficiency and readability by combining multiple loops into single passes.

Comment on lines +304 to +362
depositsByWithdrawalTx := make(map[chainhash.Hash][]*deposit.Deposit)
hash2tx := make(map[chainhash.Hash]*wire.MsgTx)

// Build an index of all known finalized withdrawal transactions.
for _, d := range withdrawingDeposits {
if d.FinalizedWithdrawalTx == nil {
continue
}

txid := d.FinalizedWithdrawalTx.TxHash()
hash2tx[txid] = d.FinalizedWithdrawalTx
}

// If exactly one tx hash is present, we can recover missing tx pointers
// from that single cluster.
var fallbackTx *wire.MsgTx
if len(hash2tx) == 1 {
for _, tx := range hash2tx {
fallbackTx = tx
}
}

for _, d := range withdrawingDeposits {
withdrawalTx := d.FinalizedWithdrawalTx
if withdrawalTx == nil {
if fallbackTx == nil {
log.Warnf("Skipping withdrawing deposit %v "+
"during recovery: missing finalized "+
"withdrawal tx", d.OutPoint)

continue
}

// Persist the recovered tx pointer so future restarts
// don't depend on in-memory fallback recovery.
d.Lock()
d.FinalizedWithdrawalTx = fallbackTx
d.Unlock()

err := m.cfg.DepositManager.UpdateDeposit(ctx, d)
if err != nil {
return nil, nil, fmt.Errorf("unable to "+
"persist recovered finalized "+
"withdrawal tx for deposit %v: %w",
d.OutPoint, err)
}

log.Warnf("Recovered missing finalized withdrawal tx "+
"for deposit %v", d.OutPoint)

withdrawalTx = fallbackTx
}

txid := withdrawalTx.TxHash()
hash2tx[txid] = withdrawalTx
depositsByWithdrawalTx[txid] = append(
depositsByWithdrawalTx[txid], d,
)
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The current implementation iterates over withdrawingDeposits twice. This can be made more efficient and arguably clearer by partitioning the deposits into those with and without a finalized transaction in a single pass. This avoids redundant work and simplifies the logic.

depositsByWithdrawalTx := make(map[chainhash.Hash][]*deposit.Deposit)
	hash2tx := make(map[chainhash.Hash]*wire.MsgTx)
	var missingTxDeposits []*deposit.Deposit

	// Partition deposits into those with and without a finalized tx, and
	// group the ones that have a tx.
	for _, d := range withdrawingDeposits {
		if d.FinalizedWithdrawalTx == nil {
			missingTxDeposits = append(missingTxDeposits, d)
			continue
		}

		txid := d.FinalizedWithdrawalTx.TxHash()
		hash2tx[txid] = d.FinalizedWithdrawalTx
		depositsByWithdrawalTx[txid] = append(
			depositsByWithdrawalTx[txid], d,
		)
	}

	// If there are deposits with missing tx pointers and there is exactly
	// one cluster of deposits with a known tx, we can recover.
	if len(missingTxDeposits) > 0 && len(hash2tx) == 1 {
		var fallbackTx *wire.MsgTx
		var fallbackTxID chainhash.Hash
		for txid, tx := range hash2tx {
			fallbackTx = tx
			fallbackTxID = txid
			break
		}

		for _, d := range missingTxDeposits {
			// Persist the recovered tx pointer so future restarts
			// don't depend on in-memory fallback recovery.
			d.Lock()
			d.FinalizedWithdrawalTx = fallbackTx
			d.Unlock()

			err := m.cfg.DepositManager.UpdateDeposit(ctx, d)
			if err != nil {
				return nil, nil, fmt.Errorf("unable to "+
					"persist recovered finalized "+
					"withdrawal tx for deposit %v: %w",
					d.OutPoint, err)
			}

			log.Warnf("Recovered missing finalized withdrawal tx "+
				"for deposit %v", d.OutPoint)

			// Add the recovered deposit to the cluster.
			depositsByWithdrawalTx[fallbackTxID] = append(
				depositsByWithdrawalTx[fallbackTxID], d,
			)
		}
	} else if len(missingTxDeposits) > 0 {
		// Log deposits that could not be recovered.
		for _, d := range missingTxDeposits {
			log.Warnf("Skipping withdrawing deposit %v "+
				"during recovery: missing finalized "+
				"withdrawal tx", d.OutPoint)
		}
	}

Comment on lines +372 to +384
for _, d := range deposits {
d.Lock()
d.FinalizedWithdrawalTx = finalizedTx
d.Unlock()
}

for _, d := range deposits {
err := m.cfg.DepositManager.UpdateDeposit(ctx, d)
if err != nil {
return fmt.Errorf("failed to update deposit %v: %w",
d.OutPoint, err)
}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The function persistFinalizedWithdrawalTx currently iterates through the deposits slice twice: once to update the FinalizedWithdrawalTx in memory, and a second time to persist each deposit to the database. These two loops can be combined into a single loop for better efficiency and readability. In each iteration, you can update the in-memory object and then immediately call UpdateDeposit. This also improves consistency in case of a failure during persistence.

	for _, d := range deposits {
		d.Lock()
		d.FinalizedWithdrawalTx = finalizedTx
		d.Unlock()

		err := m.cfg.DepositManager.UpdateDeposit(ctx, d)
		if err != nil {
			return fmt.Errorf("failed to update deposit %v: %w",
				d.OutPoint, err)
		}
	}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant