
wip(benchmark): create simpler version of benchmark runner #2514

Draft
meyer9 wants to merge 16 commits into main from benchmark/port

Conversation


@meyer9 meyer9 commented May 4, 2026

The web interface can stay in base/benchmark and the benchmark runner will live here eventually. This removes a lot of the old dependencies on Geth and simplifies the tool significantly.

meyer9 and others added 7 commits May 4, 2026 12:49
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
…xpansion

Adds the base-benchmark crate with:
- BenchmarkConfig YAML types with full matrix variable expansion (cartesian product, ≤100 run guard)
- BenchmarkError unified error enum
- src/params.rs with rollup consts (keys, gas limits, EIP-1559 params)
- Axum/tokio CLI binary stub with all flags and BASE_BENCH_ env prefix
- Output helpers (random_id, gzip_file, write_result_json, dump_log_tail)
- 8 unit tests (config round-trip, matrix expansion, gzip round-trip)
- flate2, nix, prometheus-parse added to workspace dependencies
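The cartesian-product expansion with a run cap can be sketched in plain Rust. The names here (`expand_matrix`, where the 100-run guard fires) are illustrative assumptions, not the crate's actual API:

```rust
use std::collections::BTreeMap;

/// Expand a matrix of variables into the cartesian product of all value
/// combinations, rejecting configurations that would produce more than
/// 100 runs (names are illustrative, not the crate's real types).
fn expand_matrix(
    matrix: &BTreeMap<String, Vec<String>>,
) -> Result<Vec<BTreeMap<String, String>>, String> {
    // Start with a single empty run, then multiply it by each variable's values.
    let mut runs: Vec<BTreeMap<String, String>> = vec![BTreeMap::new()];
    for (var, values) in matrix {
        let mut next = Vec::with_capacity(runs.len() * values.len());
        for run in &runs {
            for value in values {
                let mut expanded = run.clone();
                expanded.insert(var.clone(), value.clone());
                next.push(expanded);
            }
        }
        runs = next;
        if runs.len() > 100 {
            return Err(format!("matrix expands to {} runs, max is 100", runs.len()));
        }
    }
    Ok(runs)
}

fn main() {
    let mut matrix = BTreeMap::new();
    matrix.insert("gas_limit".to_string(), vec!["30M".into(), "60M".into()]);
    matrix.insert(
        "node_type".to_string(),
        vec!["reth".into(), "builder".into(), "geth".into()],
    );
    let runs = expand_matrix(&matrix).unwrap();
    println!("{}", runs.len()); // 2 * 3 = 6 combinations
}
```

Checking the guard after each variable keeps the intermediate product bounded rather than only validating the final count.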

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
…nager

Infrastructure utilities for child process orchestration:
- PortManager: TcpListener::bind probe over 10000-65535 with mutex-guarded HashSet, acquire/release/acquire_n
- ProcessHandle: tokio::process::Command with SIGINT via nix, 5s grace period, SIGKILL escalation, structured tracing
- SnapshotManager: sha256(command)[:12] cache key, force_clean, explicit datadir passthrough, external script invocation
- 20 tests passing (up from 8)
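A minimal std-only sketch of the PortManager idea described above: probe for a free port with a real bind, and track handed-out ports in a mutex-guarded set. Type and method names mirror the bullet but are assumptions about the real implementation:

```rust
use std::collections::HashSet;
use std::net::TcpListener;
use std::sync::Mutex;

/// Illustrative PortManager: a mutex-guarded set of ports already handed
/// out, plus a bind probe over the 10000-65535 range.
struct PortManager {
    used: Mutex<HashSet<u16>>,
}

impl PortManager {
    fn new() -> Self {
        Self { used: Mutex::new(HashSet::new()) }
    }

    fn acquire(&self) -> Option<u16> {
        let mut used = self.used.lock().expect("port set mutex poisoned");
        for port in 10000..=65535u16 {
            if used.contains(&port) {
                continue;
            }
            // A successful bind proves the port is free right now; the
            // listener is dropped immediately, so there is a small TOCTOU
            // window before the child process binds it itself.
            if TcpListener::bind(("127.0.0.1", port)).is_ok() {
                used.insert(port);
                return Some(port);
            }
        }
        None
    }

    fn release(&self, port: u16) {
        self.used.lock().expect("port set mutex poisoned").remove(&port);
    }
}

fn main() {
    let pm = PortManager::new();
    let a = pm.acquire().expect("a free port in 10000..=65535");
    let b = pm.acquire().expect("a second free port");
    assert_ne!(a, b);
    pm.release(a);
    println!("acquired {a} and {b}");
}
```

The set, not the probe, is what prevents two concurrent `acquire` calls from returning the same port before either child has bound it.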

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
… BuilderClient

Client abstraction layer for EL process orchestration:
- ExecutionClient trait (run/stop/rpc_url/auth_rpc_url/metrics_port/get_version/set_head/flashblocks_client/supports_flashblocks)
- BaseRethNodeClient: 4 ports (EL/Auth/Metrics/P2P), full reth arg list, txpool backup cleanup, 240s RPC readiness poll, SIGINT stop
- BuilderClient: wraps BaseRethNodeClient, acquires flashblocks WS port, appends --flashblocks.port/block-time/rollup.chain-block-time
- setup_node() dispatch fn mapping node_type string to boxed trait object
- FlashblocksClient stub in src/flashblocks/ (full WS impl in Phase 7)
- 25 tests passing (up from 20)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Implements BaseConsensusClient (wraps BaseEngineClient with FCU/getPayload/newPayload),
SequencerConsensusClient (proposes blocks with fake L1BlockInfo deposit tx and Holocene
EIP-1559 params), SyncingConsensusClient (replays payloads for validator sync benchmarking),
FakeMempool (deposit-txs-first ordering), BlockMetrics (Prometheus delta scraping,
Histogram/Gauge/Counter handling), MetricsCollector, write_metrics_json, check_thresholds,
and all metric name consts. 37/37 tests passing.
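The FakeMempool's deposit-txs-first ordering can be illustrated with a stable sort; the `Tx` type and `order_for_payload` helper here are invented for the example:

```rust
/// Stand-in transaction type for the example (not the crate's real type).
#[derive(Debug, Clone, PartialEq)]
enum Tx {
    Deposit(&'static str),
    User(&'static str),
}

/// Deposit transactions must come first in each payload. `sort_by_key`
/// is a stable sort, so arrival order is preserved within each group.
fn order_for_payload(mut txs: Vec<Tx>) -> Vec<Tx> {
    txs.sort_by_key(|tx| match tx {
        Tx::Deposit(_) => 0,
        Tx::User(_) => 1,
    });
    txs
}

fn main() {
    let ordered = order_for_payload(vec![
        Tx::User("u1"),
        Tx::Deposit("l1-info"),
        Tx::User("u2"),
    ]);
    assert_eq!(ordered[0], Tx::Deposit("l1-info"));
    println!("{ordered:?}");
}
```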
Adds axum-based JSON-RPC intercept proxy (eth_sendRawTransaction → FakeMempool,
everything else forwarded to upstream), PayloadWorker trait, and LoadTestPayloadWorker
that writes a temp YAML config and spawns base-load-test as a subprocess with
FUNDER_KEY env and duration=99999s. 41/41 tests passing.

cb-heimdall commented May 4, 2026

🟡 Heimdall Review Status

Reviews: 🟡 0/1 required approvals


vercel Bot commented May 4, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project: base | Deployment: Ignored | Updated (UTC): May 5, 2026 2:37pm


Implements FlashblocksClient (tokio-tungstenite WS consumer with auto-reconnect,
exponential backoff, and Flashblock::try_decode_message for JSON/Brotli messages)
and FlashblockReplayServer (axum WS broadcast server replaying stored flashblock
payloads to benchmarked validator clients). 43/43 tests passing.
Comment on lines +306 to +307
let fb_block_time = options.flashblocks_block_time_ms.unwrap_or(block_time_ms / 4);
let inner = BaseRethNodeClient::new(options, internal, port_manager, None);

BuilderClient creates a BaseRethNodeClient internally, which uses self.options.reth_bin as the binary (via self.binary()). The builder_bin field in ClientOptions is never read — so the builder client runs the reth binary instead of the builder binary.

You likely need to swap reth_bin for builder_bin here (or override the binary in the inner client):

Suggested change
  let fb_block_time = options.flashblocks_block_time_ms.unwrap_or(block_time_ms / 4);
- let inner = BaseRethNodeClient::new(options, internal, port_manager, None);
+ let mut options_for_inner = options.clone();
+ options_for_inner.reth_bin = options.builder_bin.clone();
+ let inner = BaseRethNodeClient::new(options_for_inner, internal, port_manager, None);

Comment on lines +63 to +68
let req: JsonRpcRequest = match serde_json::from_slice(&body) {
Ok(r) => r,
Err(_) => {
return proxy_raw(&state, &headers, body).await.into_response();
}
};

This only handles single JSON-RPC requests. If the load-test binary sends batch requests (JSON arrays), from_slice::<JsonRpcRequest> will fail and the entire batch gets forwarded to the upstream via proxy_raw — bypassing the mempool interception.

Consider also trying to parse as Vec<JsonRpcRequest> in the error branch, extracting any eth_sendRawTransaction calls from the batch.

Comment on lines +108 to +111
fn histogram_sum_count(buckets: &[prometheus_parse::HistogramCount]) -> (f64, f64) {
let sum = buckets.iter().filter(|b| b.less_than == f64::INFINITY).map(|b| b.count).sum();
let count = buckets.last().map(|b| b.count).unwrap_or(0.0);
(sum, count)

This function looks incorrect for extracting histogram sum and count. In the prometheus_parse crate, HistogramCount represents bucket entries with less_than (the bucket boundary) and count (the cumulative count up to that boundary). The histogram _sum and _count are separate parsed entries — _sum is typically not accessible from the bucket list.

The current code:

  • sum: filters for less_than == +Inf and sums their count — but +Inf bucket's count is the total sample count, not the sum of observed values
  • count: takes the last bucket's count, which happens to be the same total count

So both sum and count end up returning the total count, and the delta computation becomes (count_delta) / (count_delta) == 1.0 always.

You likely need to extract _sum and _count from separate Sample entries rather than from bucket data.
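The separate `_sum`/`_count` extraction this comment asks for can be sketched with a plain map standing in for parsed Prometheus series. The metric name and the `delta_average` helper are illustrative, not the crate's API:

```rust
use std::collections::HashMap;

/// Compute a per-interval histogram average as delta_sum / delta_count,
/// reading `<base>_sum` and `<base>_count` as separate scraped series
/// rather than deriving both from the bucket list.
fn delta_average(
    prev: &HashMap<String, f64>,
    cur: &HashMap<String, f64>,
    base: &str,
) -> Option<f64> {
    let sum_key = format!("{base}_sum");
    let count_key = format!("{base}_count");
    let delta_sum = cur.get(&sum_key)? - prev.get(&sum_key).copied().unwrap_or(0.0);
    let delta_count = cur.get(&count_key)? - prev.get(&count_key).copied().unwrap_or(0.0);
    if delta_count <= 0.0 {
        return None; // no new observations this interval
    }
    Some(delta_sum / delta_count)
}

fn main() {
    let prev = HashMap::from([
        ("engine_new_payload_sum".to_string(), 1.5),
        ("engine_new_payload_count".to_string(), 10.0),
    ]);
    let cur = HashMap::from([
        ("engine_new_payload_sum".to_string(), 2.5),
        ("engine_new_payload_count".to_string(), 14.0),
    ]);
    // 1.0 second of newly observed time over 4 new samples = 0.25s average.
    assert_eq!(delta_average(&prev, &cur, "engine_new_payload"), Some(0.25));
    println!("ok");
}
```

Because `_sum` accumulates observed values and `_count` accumulates sample counts, the ratio of their deltas gives the true per-interval mean, unlike the bucket-only version the comment flags.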

Comment on lines +25 to +28
fn fake_beacon_root() -> B256 {
use alloy_primitives::keccak256;
keccak256(FAKE_BEACON_ROOT_PREIMAGE)
}

Per CLAUDE.md: "All use imports must be at the top of the file or the top of a mod block. Never place use statements inside function bodies." This use alloy_primitives::keccak256 should be moved to the top of the file alongside the other alloy imports.

/// Required in every payload's transaction list. Since the benchmark has no
/// real L1 chain, all L1 fields are zero.
fn build_l1_info_deposit_tx() -> Bytes {
use alloy_rlp::Encodable;

Same issue — use alloy_rlp::Encodable should be at the top of the file, not inside a function body.

Comment on lines +17 to +21
pub const BATCHER_KEY: B256 =
b256!("d2ba8e70072983384203c438d4e94bf399cbd88bbcafb82b61cc96ed12541707");

pub const PREFUND_KEY: B256 =
b256!("ad0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80");

These keys don't match the values in PORTING_REFERENCE.md which specifies:

  • Batcher key: 0xac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80
  • Prefund key: 0x59c6995e998f97a5a0044966f0945389dc9e86dae88c7a8412f4603b6b78690d

If these were intentionally changed, the porting reference should be updated accordingly. If not, these need to match the Go source so benchmarks produce comparable results.

meyer9 added 7 commits May 4, 2026 13:36
Adds NetworkBenchmark (run_all/run_one loop), RunnerOptions (binary paths),
and full end-to-end orchestration: tempdir setup, snapshot preparation, node
lifecycle, proxy + load-test subprocess, SequencerConsensusClient block loop,
MetricsCollector, threshold violation checks, and per-run JSON result output.
44/44 tests passing.
Adds service::run_benchmark (reads YAML config, constructs NetworkBenchmark,
reports violations, fails on error-severity threshold breaches) and wires
src/bin/base_bench.rs to call it with PREFUND_KEY env support. 45/45 tests.
Adds examples/devnet.yaml (devnet throughput run with erc20-transfer payload,
warning/error metric thresholds) and rewrites README with architecture diagram,
usage instructions, config field reference, and output description.
Mark all phases complete (1-10), check off all task items, update module
layout to match actual files built (proxy/, no rollup.rs, service.rs role),
and condense verbose task descriptions to what was actually implemented.
… run

- BuilderClient: use builder_bin instead of reth_bin (was launching wrong binary)
- MetricsCollector: use node.metrics_port() instead of dead pre-acquired port
- Remove unused metrics_port acquisition from runner
- genesis_json_from_rollup_config: generate EL genesis.json from rollup config so
  fork timestamps (canyonTime, ecotoneTime, etc.) are properly picked up by reth
  chain spec loading -- without this, no hardforks were activated and FCU V3 was
  rejected as 'withdrawals pre-Shanghai'
- BaseConsensusClient: init_from_genesis() seeds head hash/number/timestamp from
  the node's actual genesis block via eth_getBlockByNumber
- BaseConsensusClient: pass real RollupConfig to engine client (not Default)
- Engine API: use FCU V3 + getPayload V4 + newPayload V4 (latest supported)
- PayloadAttributes: withdrawals=Some([]) required post-Canyon; None caused -38003
- README: tag architecture code block as 'text' to suppress doctest
- 45/45 tests passing; benchmark completes 20 blocks, success=true, 0 violations
Write per-run output directories with metadata.json, metrics-sequencer.json,
and metrics-validator.json matching the visualization dashboard's expected format.

- Metric keys use latency/fork_choice_updated, latency/new_payload, etc.
- Per-block metrics in nanoseconds, summary metrics in seconds
- gas/per_second uses wall-clock block-building time (sequencer) and
  newPayload duration (validator)
- Validator replay via SyncingConsensusClient collects Prometheus metrics
- Runner writes self-contained output dir with sequencer/validator logs
- config_path threaded through for sourceFile in metadata
- node_args wired from BenchmarkDefinition to extra_args on node clients
…plit

Fix a series of bugs preventing user transactions from landing in blocks:
- Pre-fund the funder account in genesis alloc using the prefund key address
- Set genesis timestamp to now() so FCU timestamps are never in the past
- Set FCU timestamp to now+block_time so the builder has time for flashblocks
- Add post-newPayload FCU with updated head hash to advance builder canonical chain
- Use proper L1BlockInfoEcotone calldata and addresses for the deposit tx so
  eth_getBlockReceipts does not error on block decoding
- Pipe load-test stdout to detect when funding is complete (removes warmup_blocks)
- Run unmeasured setup blocks via tokio::select! until load test signals ready,
  then start the measured test phase
- Reset child process signal mask via pre_exec and use process_group(0) to
  prevent inherited signal state from triggering spurious shutdowns
- Move install_signal_handler to after fund_accounts in the load-test binary
- Zero out TransferPayload default value (no ETH transfer, just gas consumption)
- Add sender_count and funding_amount fields to the load-test YAML config
- Add skip_serializing_if on WeightedTx optional fields to avoid YAML parse errors
- Add base-protocol, alloy-signer-local, and libc dependencies
Comment on lines +63 to +70
unsafe {
cmd.pre_exec(|| {
let mut sigset = std::mem::zeroed::<libc::sigset_t>();
libc::sigemptyset(&mut sigset);
libc::pthread_sigmask(libc::SIG_SETMASK, &sigset, std::ptr::null_mut());
Ok(())
});
}

Missing // SAFETY: comment on the unsafe block. Document why this is sound — e.g. that pre_exec runs after fork() in the child process, sigemptyset/pthread_sigmask are async-signal-safe, and no Rust invariants are broken because the child hasn't yet executed any Rust code post-fork.

Suggested change (adds the missing comment above the otherwise unchanged block):

+ // SAFETY: `pre_exec` runs in the forked child process before exec.
+ // `sigemptyset` and `pthread_sigmask` are async-signal-safe (POSIX),
+ // so they are safe to call in this context. We clear the signal mask
+ // so the child process does not inherit blocked signals from the
+ // parent's tokio runtime.
  unsafe {
      cmd.pre_exec(|| {
          let mut sigset = std::mem::zeroed::<libc::sigset_t>();
          libc::sigemptyset(&mut sigset);
          libc::pthread_sigmask(libc::SIG_SETMASK, &sigset, std::ptr::null_mut());
          Ok(())
      });
  }

Comment on lines +231 to +236
let now_secs = std::time::SystemTime::now()
.duration_since(std::time::UNIX_EPOCH)
.expect("system time after epoch")
.as_secs();
let block_time_secs = block_time.as_secs().max(1);
let next_timestamp = now_secs + block_time_secs;

Bug: The Go source (per PORTING_REFERENCE.md §5) increments the timestamp from the previous block's timestamp by the block time, i.e. self.base.head_block_timestamp + block_time_secs. Using wall-clock time (SystemTime::now()) here means:

  1. Timestamps can drift if block production doesn't match real-time.
  2. If two blocks are proposed quickly (e.g. during setup), the timestamp could be the same or very close, causing the node to reject the payload.
  3. Results are non-reproducible across runs.
Suggested change
- let now_secs = std::time::SystemTime::now()
-     .duration_since(std::time::UNIX_EPOCH)
-     .expect("system time after epoch")
-     .as_secs();
- let block_time_secs = block_time.as_secs().max(1);
- let next_timestamp = now_secs + block_time_secs;
+ let next_timestamp = self.base.head_block_timestamp + block_time.as_secs().max(1);

Comment on lines +385 to +401
let alloc = {
use alloy_signer_local::PrivateKeySigner;
let key = prefund_key.trim_start_matches("0x");
let signer = PrivateKeySigner::from_bytes(
&alloy_primitives::hex::decode(key)
.expect("valid hex prefund key")
.as_slice()
.try_into()
.expect("32-byte private key"),
)
.expect("valid private key");
let addr = format!("{:?}", signer.address());
serde_json::json!({
addr: {
"balance": "0x3635C9ADC5DEA00000000",
}
})

Three .expect() calls on user-supplied data (prefund_key). If the prefund key is invalid hex or not 32 bytes, this panics the benchmark runner instead of returning a BenchmarkError. Convert to ?-based error handling:

Suggested change
  let alloc = {
      use alloy_signer_local::PrivateKeySigner;
      let key = prefund_key.trim_start_matches("0x");
-     let signer = PrivateKeySigner::from_bytes(
-         &alloy_primitives::hex::decode(key)
-             .expect("valid hex prefund key")
-             .as_slice()
-             .try_into()
-             .expect("32-byte private key"),
-     )
-     .expect("valid private key");
+     let key_bytes = alloy_primitives::hex::decode(key)
+         .map_err(|e| BenchmarkError::Config(format!("invalid prefund key hex: {e}")))?;
+     let key_array: [u8; 32] = key_bytes.as_slice()
+         .try_into()
+         .map_err(|_| BenchmarkError::Config(format!("prefund key must be 32 bytes, got {}", key_bytes.len())))?;
+     let signer = PrivateKeySigner::from_bytes(&key_array)
+         .map_err(|e| BenchmarkError::Config(format!("invalid prefund private key: {e}")))?;
      let addr = format!("{:?}", signer.address());
      serde_json::json!({
          addr: {
              "balance": "0x3635C9ADC5DEA00000000",
          }
      })
  };

Also: use alloy_signer_local::PrivateKeySigner is inside a function body. Per CLAUDE.md, use imports must be at the top of the file.

Comment on lines +194 to +196
if let Err(e) = {
use std::io::Seek;
file.seek(io::SeekFrom::End(-(max_bytes as i64)))

Two issues:

  1. use inside function body: use std::io::Seek should be at the top of the file, per project conventions.

  2. Potential overflow: -(max_bytes as i64) will produce a wrong value if max_bytes > i64::MAX as u64. While unlikely in practice (this is called with 1 MB), a defensive check or i64::try_from(max_bytes) would be safer.
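A std-only sketch of the safer tail read the comment suggests. The `read_tail` name and the clamp-to-file-length behavior are assumptions for illustration, not the crate's actual helper:

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Read at most `max_bytes` from the end of the file, converting the
/// offset with i64::try_from instead of `as`, and clamping to the file
/// length so the negative seek never goes before the start of the file.
fn read_tail(file: &mut std::fs::File, max_bytes: u64) -> io::Result<Vec<u8>> {
    let len = file.metadata()?.len();
    let tail = max_bytes.min(len);
    let offset = i64::try_from(tail)
        .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "tail size exceeds i64"))?;
    file.seek(SeekFrom::End(-offset))?;
    let mut buf = Vec::with_capacity(tail as usize);
    file.read_to_end(&mut buf)?;
    Ok(buf)
}

fn main() -> io::Result<()> {
    // Demonstrate on a temp file containing 10 bytes, reading the last 4.
    let path = std::env::temp_dir().join("tail_demo.bin");
    std::fs::write(&path, b"0123456789")?;
    let mut f = std::fs::File::open(&path)?;
    let tail = read_tail(&mut f, 4)?;
    assert_eq!(tail, b"6789");
    std::fs::remove_file(&path)?;
    Ok(())
}
```

Clamping to the file length also handles the case the original code misses: a file shorter than `max_bytes`, where `SeekFrom::End(-(max_bytes as i64))` would fail outright.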

Comment thread PORTING_REFERENCE.md
@@ -0,0 +1,746 @@
# Benchmark Tool — Rust Port Reference

This file is a byte-for-byte copy of crates/infra/benchmark/PORTING_REFERENCE.md. Having it at the repo root seems unintentional — the benchmark-specific porting reference should live only in the benchmark crate directory. Consider removing this duplicate.

Comment on lines +257 to +260
node_type: "base-reth-node".to_string(),
extra_args: vec![],
reth_bin: self.options.reth_bin.clone(),
builder_bin: self.options.builder_bin.clone(),

new_payload already sets self.base.head_block_hash, head_block_number, and head_block_timestamp internally (see consensus/mod.rs:197). This manual reassignment on line 259 is redundant — you can remove it.

prometheus_parse emits histogram _sum/_count as separate Counter samples.
Add BlockMetrics::compute_delta_averages() which computes (delta_sum / delta_count)
relative to the previous scrape and inserts the result as <base>_avg, matching
the Go runner's per-interval histogram average logic.
Comment on lines +291 to +324
async fn batch_send_txs(&self, txs: &[Bytes]) -> Result<(), BenchmarkError> {
const BATCH_SIZE: usize = 100;
let client = reqwest::Client::new();
let mut set = JoinSet::new();

for batch in txs.chunks(BATCH_SIZE) {
let batch: Vec<serde_json::Value> = batch
.iter()
.map(|tx| {
serde_json::json!({
"jsonrpc": "2.0",
"method": "eth_sendRawTransaction",
"params": [format!("0x{}", hex::encode(tx))],
"id": 1,
})
})
.collect();

let url = self.rpc_url.clone();
let c = client.clone();
set.spawn(async move {
match c.post(&url).json(&batch).send().await {
Ok(resp) => {
if let Ok(body) = resp.text().await {
tracing::debug!(response = %body, "batch_send_txs response");
}
}
Err(e) => tracing::warn!(error = %e, "batch_send_txs http error"),
}
});
}

while set.join_next().await.is_some() {}
Ok(())

batch_send_txs silently swallows all HTTP errors — it logs warnings but always returns Ok(()). If the RPC endpoint is down or every batch fails, propose will continue as if transactions were submitted, producing empty blocks without any indication of failure to the caller.

Consider propagating at least one error from the JoinSet, or returning the count of failed batches so the caller can decide whether to continue:

async fn batch_send_txs(&self, txs: &[Bytes]) -> Result<(), BenchmarkError> {
    const BATCH_SIZE: usize = 100;
    let client = reqwest::Client::new();
    let mut set = JoinSet::new();

    for batch in txs.chunks(BATCH_SIZE) {
        let batch: Vec<serde_json::Value> = batch
            .iter()
            .map(|tx| {
                serde_json::json!({
                    "jsonrpc": "2.0",
                    "method": "eth_sendRawTransaction",
                    "params": [format!("0x{}", hex::encode(tx))],
                    "id": 1,
                })
            })
            .collect();

        let url = self.rpc_url.clone();
        let c = client.clone();
        set.spawn(async move {
            c.post(&url).json(&batch).send().await.map(|_| ())
        });
    }

    let mut last_err = None;
    while let Some(result) = set.join_next().await {
        if let Ok(Err(e)) = result {
            tracing::warn!(error = %e, "batch_send_txs http error");
            last_err = Some(e);
        }
    }
    if let Some(e) = last_err {
        return Err(BenchmarkError::Client(format!("batch tx send failed: {e}")));
    }
    Ok(())
}

  impl Default for TransferPayload {
      fn default() -> Self {
-         Self { min_value: U256::from(1_000u64), max_value: U256::from(100_000u64) }
+         Self { min_value: U256::ZERO, max_value: U256::ZERO }

This changes TransferPayload::default() from sending 1,000–100,000 wei transfers to sending zero-value transfers for all users of the load-test crate, not just the benchmark runner. If any other caller relies on the load-test binary sending non-zero transfers (e.g. to verify balance changes or trigger specific EVM behavior), this silently breaks them.

If zero-value is only needed for benchmarking, consider making this configurable via the YAML config instead of changing the default.

Comment on lines +222 to +234
let mut ready = std::pin::pin!(worker.wait_until_ready());
loop {
tokio::select! {
biased;
result = &mut ready => {
result?;
break;
}
result = sequencer.propose(&mempool, block_time, gas_limit) => {
result?;
}
}
}

Cancellation safety issue: sequencer.propose() is not cancel-safe. When ready completes, the in-flight propose future is dropped. If propose was mid-way through (e.g., new_payload completed but the subsequent update_fork_choice(None) on line 260 hasn't run yet), the sequencer's internal state (head_block_hash, etc.) will be stale or inconsistent for the next propose call in the measured loop.

Consider running the setup propose to completion before re-checking readiness:

loop {
    tokio::select! {
        biased;
        result = &mut ready => {
            result?;
            break;
        }
        result = sequencer.propose(&mempool, block_time, gas_limit) => {
            result?;
            // Check readiness after each full block rather than mid-propose
        }
    }
}

Actually, the current structure already does this since propose is a single arm — but biased means ready is polled first, and if ready resolves exactly when propose has partially completed an await point, the propose future is cancelled. A safer pattern would be to poll ready only between blocks:

loop {
    if worker.is_ready() { break; }
    sequencer.propose(&mempool, block_time, gas_limit).await?;
}

Comment on lines +52 to +104
use prometheus_parse::Value;

let sum_keys: Vec<String> = self
.execution_metrics
.keys()
.filter(|k| k.ends_with("_sum"))
.cloned()
.collect();

for sum_key in sum_keys {
let base = &sum_key[..sum_key.len() - 4];
let count_key = format!("{base}_count");

let cur_sum = match self.execution_metrics.get(&sum_key).copied() {
Some(v) => v,
None => continue,
};
let cur_count = match self.execution_metrics.get(&count_key).copied() {
Some(v) => v,
None => continue,
};

let prev_sum = self
.prev_prometheus
.get(&sum_key)
.and_then(|s| if let Value::Counter(v) | Value::Gauge(v) | Value::Untyped(v) = s.value { Some(v) } else { None })
.unwrap_or(0.0);
let prev_count = self
.prev_prometheus
.get(&count_key)
.and_then(|s| if let Value::Counter(v) | Value::Gauge(v) | Value::Untyped(v) = s.value { Some(v) } else { None })
.unwrap_or(0.0);

let delta_count = cur_count - prev_count;
if delta_count > 0.0 {
let avg = (cur_sum - prev_sum) / delta_count;
if !avg.is_nan() {
self.execution_metrics.insert(format!("{base}_avg"), avg);
}
}
}
}

/// Update a metric from a new Prometheus sample, computing deltas for
/// Histogram and Summary types (deltaSum / deltaCount).
///
/// NaN results (e.g. from zero delta count) are silently skipped.
pub fn update_prometheus_metric(
&mut self,
name: &str,
current: &prometheus_parse::Sample,
) {
use prometheus_parse::Value;

Per project conventions (CLAUDE.md): use prometheus_parse::Value; appears inside function/method bodies at lines 52 and 104. These should be moved to the file-level imports at the top.

jwt_secret: alloy_rpc_types_engine::JwtSecret,
cfg: Arc<RollupConfig>,
) -> Result<Self, BenchmarkError> {
let dummy_l1: Url = "http://127.0.0.1:1".parse().unwrap();

Bare .unwrap() on a hardcoded URL parse. While this will never fail, the project convention is to avoid .unwrap() in non-test code. Use .expect() with a message:

Suggested change
- let dummy_l1: Url = "http://127.0.0.1:1".parse().unwrap();
+ let dummy_l1: Url = "http://127.0.0.1:1".parse().expect("valid hardcoded dummy L1 URL");

Comment on lines 139 to 142
let mut runner = LoadRunner::new(load_config.clone())?;
runner.set_config_summary(config_summary.clone());

// Install signal handler before any long-running work. First signal sets
// the stop flag so `run()` exits its loop gracefully and the drain sequence
// runs. A second signal force-exits.
let stop_flag = runner.stop_flag();

The stop_flag binding on line 142 is now unused between its creation and passing to run_test_phases. Previously, install_signal_handler(stop_flag) was called here — now that's moved inside run_test_phases. This leaves a blank line and the stop_flag binding dangling far from its use as a function argument. Minor, but worth cleaning up:

Suggested change
  let mut runner = LoadRunner::new(load_config.clone())?;
  runner.set_config_summary(config_summary.clone());
- // Install signal handler before any long-running work. First signal sets
- // the stop flag so `run()` exits its loop gracefully and the drain sequence
- // runs. A second signal force-exits.
  let stop_flag = runner.stop_flag();


github-actions Bot commented May 5, 2026

Review Summary

This PR adds a new base-benchmark crate (~5700 lines) that ports a Go benchmark runner to Rust, plus minor changes to the load-test binary. The overall structure is reasonable — clean module separation, proper error enum with thiserror, and async tokio-based orchestration. However, there are several correctness and safety issues that should be addressed before merge.

Critical Issues

# File Issue
1 metrics/mod.rs:155-158 histogram_sum_count is incorrect — extracts +Inf bucket count as "sum" and last bucket count as "count", making both the same value. Delta averages will always be ~1.0. (existing comment)
2 consensus/mod.rs:231-236 Wall-clock timestamp instead of incremental — uses SystemTime::now() instead of head_block_timestamp + block_time, causing non-reproducible results and potential node rejection. (existing comment)
3 client/mod.rs:309 Builder runs reth binary: BuilderClient uses reth_bin instead of builder_bin for the inner node client. (existing comment)
4 consensus/mod.rs:291-324 batch_send_txs swallows all errors — always returns Ok(()) even if every HTTP request fails, leading to silent empty blocks. (new)
5 runner/mod.rs:222-234 Cancel-unsafe tokio::select! in setup loop — dropping a partially-completed propose future can leave sequencer consensus state inconsistent. (new)

Moderate Issues

# File Issue
6 runner/mod.rs:388-395 .expect() on user-supplied prefund key — panics instead of returning BenchmarkError. (existing)
7 process.rs:63-70 Missing // SAFETY: comment on unsafe block. (existing)
8 proxy/mod.rs:63-68 Only intercepts single JSON-RPC requests; batch arrays bypass mempool. (existing)
9 params.rs:17-21 Batcher/prefund keys don't match PORTING_REFERENCE.md. (existing)
10 runner/mod.rs:259 Redundant head_block_hash reassignment after new_payload. (existing)
11 PORTING_REFERENCE.md Duplicate file at repo root. (existing)
12 load-tests/transfer.rs:31 Changing TransferPayload::default() to zero-value affects all load-test users, not just benchmark. (new)

Convention Issues

# File Issue
13 consensus/mod.rs:26,414-416 use statements inside function bodies. (existing)
14 metrics/mod.rs:52,104 use prometheus_parse::Value inside method bodies. (new)
15 output.rs:195 use std::io::Seek inside function body. (existing)
16 runner/mod.rs:386 use alloy_signer_local::PrivateKeySigner inside function body. (existing)
17 consensus/mod.rs:99 Bare .unwrap() on hardcoded URL parse — should be .expect(). (new)

Notes

  • The load-test change (moving install_signal_handler after funding) makes sense for benchmark integration — the runner reads "Accounts funded." from stdout to detect readiness.
  • The params.rs module is missing a //! doc comment, but it's not a mod.rs so this doesn't violate the specific CLAUDE.md rule.
  • The Cargo.toml dependency ordering doesn't follow the workspace's waterfall style, but this is a new crate so it can be cleaned up separately.
