
Enable transitioning of node to using MEL when previously run with Inbox Reader-Tracker #4597

Merged
rauljordan merged 20 commits into master from node-transition-to-mel
Apr 14, 2026
Conversation

@ganeshvanahalli
Contributor

@ganeshvanahalli ganeshvanahalli commented Apr 2, 2026

This PR enables a nitro node to transition to using the message extraction (MEL) code in place of the current inbox reader and tracker code. MEL is considered consensus-active only when the state's version > 0, so a node can turn MEL on with version = 0 and use it purely for processing, without relying on it for consensus, validation, etc.

When MEL is first enabled on the node, we detect an appropriate finalized parent chain block from which a batch has been read, then fill in the MEL state fields accordingly and create an initial MEL state. From there on, the node uses MEL for processing parent chain blocks and is no longer allowed to restart the old way, i.e. using the inbox reader and tracker code.
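The transition step described above can be sketched roughly as follows. All names here (`melState`, `legacyDB`, `createInitialMELState`) are hypothetical simplifications for illustration, not the actual nitro types; the real logic lives in the legacy-DB read layer and `CreateInitialMELStateFromLegacyDB`.

```go
package main

import (
	"errors"
	"fmt"
)

// melState is a simplified stand-in for the node's MEL state.
type melState struct {
	ParentChainBlock uint64
	BatchCount       uint64
	DelayedSeen      uint64
	Version          uint64 // 0 = processing only; >0 = consensus-active
}

// legacyDB is a simplified stand-in for the legacy inbox reader/tracker metadata.
type legacyDB struct {
	finalizedBlock uint64 // latest finalized parent chain block
	batchCount     uint64
	delayedSeen    uint64
	lastBatchBlock uint64 // parent chain block of the last batch that was read
}

// createInitialMELState sketches the transition: pick a finalized parent
// chain block from which a batch has been read, then seed the initial MEL
// state from the legacy batch metadata.
func createInitialMELState(db *legacyDB) (*melState, error) {
	if db.lastBatchBlock > db.finalizedBlock {
		return nil, errors.New("no finalized parent chain block with a read batch yet")
	}
	return &melState{
		ParentChainBlock: db.lastBatchBlock,
		BatchCount:       db.batchCount,
		DelayedSeen:      db.delayedSeen,
		Version:          0, // start in processing-only mode; consensus use requires Version > 0
	}, nil
}

func main() {
	db := &legacyDB{finalizedBlock: 300000, batchCount: 1200, delayedSeen: 3400, lastBatchBlock: 299950}
	st, err := createInitialMELState(db)
	if err != nil {
		panic(err)
	}
	fmt.Println(st.ParentChainBlock, st.BatchCount, st.Version) // 299950 1200 0
}
```

Once this initial state is persisted, the node treats it as the boundary: reads below the recorded batch/delayed counts dispatch to legacy keys, everything after is produced by MEL.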

Testing Done

I successfully ran an arb-sepolia node from scratch up to ~300k blocks using the inbox reader and tracker code, shut it down, and then restarted it with MEL turned on; the node continued catching up as expected.

Resolves NIT-4209

@ganeshvanahalli ganeshvanahalli marked this pull request as ready for review April 2, 2026 21:41
@codecov

codecov bot commented Apr 2, 2026

Codecov Report

❌ Patch coverage is 8.80282% with 259 lines in your changes missing coverage. Please review.
✅ Project coverage is 33.61%. Comparing base (7e71bec) to head (a4430e7).
⚠️ Report is 21 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4597      +/-   ##
==========================================
- Coverage   33.95%   33.61%   -0.34%     
==========================================
  Files         498      499       +1     
  Lines       59889    60120     +231     
==========================================
- Hits        20333    20211     -122     
- Misses      36014    36365     +351     
- Partials     3542     3544       +2     

@github-actions
Contributor

github-actions bot commented Apr 2, 2026

❌ 15 Tests Failed:

Tests completed | Failed | Passed | Skipped
4924 | 15 | 4909 | 0
View the top 3 failed tests by shortest run time
TestEndToEnd_ManyEvilValidators
Stack Traces | -0.000s run time
... [CONTENT TRUNCATED: Keeping last 20 lines]
created by github.com/offchainlabs/nitro/bold/containers/events.(*Producer[...]).Broadcast in goroutine 203082
	/home/runner/work/nitro/nitro/bold/containers/events/producer.go:106 +0x10a

goroutine 5534630 [select]:
github.com/offchainlabs/nitro/bold/containers/events.(*Producer[...]).Broadcast.func1()
	/home/runner/work/nitro/nitro/bold/containers/events/producer.go:107 +0xd9
created by github.com/offchainlabs/nitro/bold/containers/events.(*Producer[...]).Broadcast in goroutine 203079
	/home/runner/work/nitro/nitro/bold/containers/events/producer.go:106 +0x10a

goroutine 5534639 [select]:
github.com/offchainlabs/nitro/bold/containers/events.(*Producer[...]).Broadcast.func1()
	/home/runner/work/nitro/nitro/bold/containers/events/producer.go:107 +0xd9
created by github.com/offchainlabs/nitro/bold/containers/events.(*Producer[...]).Broadcast in goroutine 203079
	/home/runner/work/nitro/nitro/bold/containers/events/producer.go:106 +0x10a

goroutine 5534629 [select]:
github.com/offchainlabs/nitro/bold/containers/events.(*Producer[...]).Broadcast.func1()
	/home/runner/work/nitro/nitro/bold/containers/events/producer.go:107 +0xd9
created by github.com/offchainlabs/nitro/bold/containers/events.(*Producer[...]).Broadcast in goroutine 203073
	/home/runner/work/nitro/nitro/bold/containers/events/producer.go:106 +0x10a
TestEndToEnd_ManyEvilValidators/honest_essential_edges_confirmed_by_challenge_win
Stack Traces | -0.000s run time
... [CONTENT TRUNCATED: Keeping last 20 lines]
created by github.com/offchainlabs/nitro/bold/containers/events.(*Producer[...]).Broadcast in goroutine 203082
	/home/runner/work/nitro/nitro/bold/containers/events/producer.go:106 +0x10a

goroutine 5534630 [select]:
github.com/offchainlabs/nitro/bold/containers/events.(*Producer[...]).Broadcast.func1()
	/home/runner/work/nitro/nitro/bold/containers/events/producer.go:107 +0xd9
created by github.com/offchainlabs/nitro/bold/containers/events.(*Producer[...]).Broadcast in goroutine 203079
	/home/runner/work/nitro/nitro/bold/containers/events/producer.go:106 +0x10a

goroutine 5534639 [select]:
github.com/offchainlabs/nitro/bold/containers/events.(*Producer[...]).Broadcast.func1()
	/home/runner/work/nitro/nitro/bold/containers/events/producer.go:107 +0xd9
created by github.com/offchainlabs/nitro/bold/containers/events.(*Producer[...]).Broadcast in goroutine 203079
	/home/runner/work/nitro/nitro/bold/containers/events/producer.go:106 +0x10a

goroutine 5534629 [select]:
github.com/offchainlabs/nitro/bold/containers/events.(*Producer[...]).Broadcast.func1()
	/home/runner/work/nitro/nitro/bold/containers/events/producer.go:107 +0xd9
created by github.com/offchainlabs/nitro/bold/containers/events.(*Producer[...]).Broadcast in goroutine 203073
	/home/runner/work/nitro/nitro/bold/containers/events/producer.go:106 +0x10a
TestRedisProduceComplex/one_producer,_all_consumers_are_active
Stack Traces | 1.350s run time
... [CONTENT TRUNCATED: Keeping last 20 lines]
DEBUG[04-14|18:28:20.880] consumer: xack                           cid=81b71789-589a-496c-b2ea-a94dfdcf0785 messageId=1776191299766-7
DEBUG[04-14|18:28:20.880] consumer: xack                           cid=6637f0ff-c089-4efa-b76f-dfec056c2777 messageId=1776191299766-10
DEBUG[04-14|18:28:20.880] Redis stream consuming                   consumer_id=1e6cc4c5-cc4e-434b-b027-26fb9eaf0a3c message_id=1776191299767-0
DEBUG[04-14|18:28:20.880] consumer: xdel                           cid=7e849589-cc36-491a-a2a4-8e60dd09f86e messageId=1776191299766-5
DEBUG[04-14|18:28:20.880] consumer: setting result                 cid=1e6cc4c5-cc4e-434b-b027-26fb9eaf0a3c msgIdInStream=1776191299767-0 resultKeyInRedis=result-key:stream:c54d3fc3-e015-4898-9ba0-3c2d61d83314.1776191299767-0
DEBUG[04-14|18:28:20.880] consumer: xdel                           cid=219be019-5ee7-454c-ae27-8fcda0472e1b messageId=1776191299766-6
DEBUG[04-14|18:28:20.880] consumer: xdel                           cid=c6119c94-e802-484b-89fa-abb56aeb28aa messageId=1776191299766-9
DEBUG[04-14|18:28:20.880] consumer: xdel                           cid=6637f0ff-c089-4efa-b76f-dfec056c2777 messageId=1776191299766-10
DEBUG[04-14|18:28:20.880] consumer: xdel                           cid=81b71789-589a-496c-b2ea-a94dfdcf0785 messageId=1776191299766-7
DEBUG[04-14|18:28:20.880] consumer: xack                           cid=1e6cc4c5-cc4e-434b-b027-26fb9eaf0a3c messageId=1776191299767-0
DEBUG[04-14|18:28:20.880] consumer: xdel                           cid=1e6cc4c5-cc4e-434b-b027-26fb9eaf0a3c messageId=1776191299767-0
DEBUG[04-14|18:28:20.881] consumer: xdel                           cid=08ec4646-b38e-4d29-adcb-956b060b8869 messageId=1776191299764-1
DEBUG[04-14|18:28:20.888] consumer: xack                           cid=8addd35e-d804-4a75-a70a-0b25a867d521 messageId=1776191299766-8
DEBUG[04-14|18:28:20.888] consumer: xdel                           cid=8addd35e-d804-4a75-a70a-0b25a867d521 messageId=1776191299766-8
DEBUG[04-14|18:28:21.001] checkResponses                           responded=92 errored=0 checked=100
DEBUG[04-14|18:28:21.026] redis producer: check responses starting
DEBUG[04-14|18:28:21.035] checkResponses                           responded=8 errored=0 checked=8
ERROR[04-14|18:28:21.035] Error from XpendingExt in getting PEL for auto claim err="context canceled" pendingLen=0
DEBUG[04-14|18:28:21.087] Error destroying a stream group          error="dial tcp 127.0.0.1:36057: connect: connection refused"
--- FAIL: TestRedisProduceComplex/one_producer,_all_consumers_are_active (1.35s)


//
// Batch count, msg count, and delayed read are derived from legacy batch metadata.
// The MEL inbox accumulator is reconstructed for any unread delayed messages.
func CreateInitialMELStateFromLegacyDB(
Contributor


This is a really nice function! Great job on putting it together

Contributor Author


thanks!

rauljordan
rauljordan previously approved these changes Apr 3, 2026
Base automatically changed from replace-readerandtracker-with-messageextractor to master April 10, 2026 11:13
@ganeshvanahalli ganeshvanahalli dismissed rauljordan’s stale review April 10, 2026 11:13

The base branch was changed.

@ganeshvanahalli ganeshvanahalli assigned eljobe and unassigned rauljordan Apr 13, 2026
Member

@eljobe eljobe left a comment


Overall this is a well-structured migration — the legacy DB read layer is solid, the CreateInitialMELStateFromLegacyDB state construction is correct, the test coverage (especially legacy_db_reads_test.go and TestMELMigrationFromLegacyNode) is thorough, and the IsBatchGasFieldsMissing optimization is a nice touch.

Two issues I'd like to see addressed before merge:

  1. SaveInitialMelState sets in-memory cache before dbBatch.Write() — if the batch write fails, the Database struct is left in an inconsistent state where it routes reads to legacy keys despite nothing being persisted. (See inline comment.)

  2. legacyGetDelayedMessageAndParentChainBlockNumber swallows non-not-found errors from the RLP prefix — if the "e" prefix key exists but is corrupted, the error is silently swallowed and the code falls back to the "d" prefix instead of surfacing the corruption. Notably, the same function does check !rawdb.IsDbErrNotFound(err) for the parent chain block number fallback a few lines later, so this asymmetry looks like an oversight. (See inline comment.)

Comment thread: arbnode/mel/runner/database.go (Outdated)
Comment on lines +77 to +80
d.hasInitialState = true
d.initialDelayedCount = initialState.DelayedMessagesSeen
d.initialBatchCount = initialState.BatchCount
return dbBatch.Write()
Member


The in-memory fields are set before dbBatch.Write(). If the batch write fails (disk full, I/O error, etc.), the Database struct will have hasInitialState = true with the boundary values cached, but the DB will not actually contain the initial MEL state or the InitialMelStateBlockNumKey. This means fetchBatchMetadata and FetchDelayedMessage will dispatch to legacy keys based on stale thresholds for the remainder of the process lifetime.

In practice the caller returns an error and the node fails to start, so this is unlikely to cause data corruption — but it's a correctness bug in the function's contract and straightforward to fix:

if err := dbBatch.Write(); err != nil {
    return err
}
d.hasInitialState = true
d.initialDelayedCount = initialState.DelayedMessagesSeen
d.initialBatchCount = initialState.BatchCount
return nil

Contributor Author


addressed it

Comment on lines +49 to +52
msg, _, err := legacyGetDelayedMessageFromRlpPrefix(db, index)
if err != nil {
// Fall back to legacy "d" prefix
msg, _, err = legacyGetDelayedMessageFromLegacyPrefix(db, index)
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When legacyGetDelayedMessageFromRlpPrefix fails for a reason other than "not found" (e.g., corrupted data causing an RLP decode error, truncated entry), the code unconditionally falls through to the legacy "d" prefix instead of surfacing the corruption error. This could silently mask data corruption.

Notably, the same function already checks !rawdb.IsDbErrNotFound(err) for the parent chain block number fallback on line 61, so the asymmetry here looks like an oversight.

Suggested fix — only fall back on not-found:

msg, _, err := legacyGetDelayedMessageFromRlpPrefix(db, index)
if err != nil {
    if !rawdb.IsDbErrNotFound(err) {
        return nil, 0, fmt.Errorf("error reading RLP delayed message at index %d: %w", index, err)
    }
    // Key genuinely absent — fall back to legacy "d" prefix
    msg, _, err = legacyGetDelayedMessageFromLegacyPrefix(db, index)
    ...
}

Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, fixed it

@eljobe eljobe assigned ganeshvanahalli and unassigned eljobe Apr 14, 2026
@ganeshvanahalli ganeshvanahalli removed their assignment Apr 14, 2026
@ganeshvanahalli ganeshvanahalli requested a review from eljobe April 14, 2026 09:34
eljobe
eljobe previously approved these changes Apr 14, 2026
@eljobe eljobe enabled auto-merge April 14, 2026 09:37
@eljobe eljobe added this pull request to the merge queue Apr 14, 2026
@ganeshvanahalli ganeshvanahalli removed this pull request from the merge queue due to a manual request Apr 14, 2026
eljobe
eljobe previously approved these changes Apr 14, 2026
@ganeshvanahalli ganeshvanahalli added this pull request to the merge queue Apr 14, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 14, 2026
@ganeshvanahalli ganeshvanahalli added this pull request to the merge queue Apr 14, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 14, 2026
@ganeshvanahalli ganeshvanahalli added this pull request to the merge queue Apr 14, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 14, 2026
@rauljordan rauljordan added this pull request to the merge queue Apr 14, 2026
Merged via the queue into master with commit c28c984 Apr 14, 2026
23 of 24 checks passed
@rauljordan rauljordan deleted the node-transition-to-mel branch April 14, 2026 19:50

3 participants