refactor(tools/da-debug): improve da debug (#3199)
Conversation

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

Claude finished @julienrbrt's task in 1m 22s.
📝 Walkthrough

Refactors the da-debug tool's search flow.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant CLI as "CLI (main)"
    participant Search as "searchForHeight"
    participant Decoder as "blob decoder goroutine"
    participant Collector as "results collector"
    CLI->>Search: start searchForHeight(rootCtx, startHeight, searchRange)
    Note right of Search: loop per DA height\n(print progress)
    Search->>Decoder: spawn decoder for each blob (concurrent)
    Decoder->>Decoder: decode header -> decode data
    Decoder->>Collector: send blobResult (match/no-match)
    Collector->>Collector: aggregate results (WaitGroup)
    alt match found & bounded / unbounded rules met
        Collector->>CLI: return FOUND results / stop signal
    else no matches and range exhausted (if bounded)
        Collector->>CLI: return no-results for DA range
    end
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ❌ 3 failed checks (2 warnings, 1 inconclusive)
Codecov Report

✅ All modified and coverable lines are covered by tests.

```
@@           Coverage Diff           @@
##             main    #3199   +/-   ##
=======================================
  Coverage   61.14%   61.14%
=======================================
  Files         117      117
  Lines       12082    12082
=======================================
  Hits         7387     7387
- Misses       3868     3871       +3
+ Partials      827      824       -3
```

Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tools/da-debug/main.go (1)
166-179: ⚠️ Potential issue | 🔴 Critical

`continue` now traps `search` on the same DA height.

Because `daHeight++` is at line 233 (after blob processing), any non-"future" error on lines 176-178 skips the increment and retries the same height forever. `queryHeight` already treats "not found" as a normal empty-height result, so the first gap with no blobs will now hang the search instead of advancing.

🐛 Move the increment to the loop header and only skip errors you explicitly want to ignore

```diff
-	daHeight := startHeight
-	for {
+	for daHeight := startHeight; ; daHeight++ {
 		if searchRange > 0 && daHeight >= startHeight+searchRange || foundBlobs > 0 {
 			break
 		}
 		fmt.Printf("Checking DA height %d...\r", daHeight)
 		blobs, err := client.Blob.GetAll(ctx, daHeight, []share.Namespace{ns})
 		if err != nil {
+			if ctx.Err() != nil {
+				return fmt.Errorf("search aborted at DA height %d: %w", daHeight, ctx.Err())
+			}
 			if strings.Contains(err.Error(), "future") {
 				fmt.Printf("Reached future height at DA height %d\n", daHeight)
 				break
 			}
-			continue
+			if strings.Contains(err.Error(), "not found") {
+				continue
+			}
+			return fmt.Errorf("failed to get blobs at DA height %d: %w", daHeight, err)
 		}
```

Per coding guidelines, "Return errors early in Go functions" and "Use context.Context for cancellation in Go code".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tools/da-debug/main.go` around lines 166 - 179, The loop currently leaves daHeight++ after blob processing which causes a non-"future" error from client.Blob.GetAll to hit the continue and retry the same DA height forever; change the loop to increment daHeight in the header (e.g., for daHeight := startHeight; searchRange==0 || daHeight < startHeight+searchRange; daHeight++) or equivalent so the height always advances, and stop using a blind continue on err — only special-case the "future" error (as handled now) and otherwise return or propagate the error from the GetAll call (or log+return) so callers can handle it; update references to daHeight, startHeight, searchRange, foundBlobs, and client.Blob.GetAll accordingly.
🧹 Nitpick comments (1)
tools/da-debug/main.go (1)
182-228: Cap the blob fan-out per height.

This allocates `len(blobs)` result slots and launches `len(blobs)` goroutines for every height. Because that count comes from the RPC response, a busy height can spike memory and CPU before you ever find a match.

♻️ A small semaphore keeps concurrency predictable

```diff
+	workers := runtime.GOMAXPROCS(0)
-	results := make(chan blobResult, len(blobs))
+	results := make(chan blobResult, workers)
+	sem := make(chan struct{}, workers)
 	var wg sync.WaitGroup
 	for _, blob := range blobs {
+		sem <- struct{}{}
 		wg.Add(1)
 		go func(b any) {
 			defer wg.Done()
+			defer func() { <-sem }()
 			// existing decode logic...
 		}(blob)
 	}
```

As per coding guidelines, "Be mindful of goroutine leaks in Go code", "Use buffered channels appropriately in Go code", and "Consider memory allocation in hot paths of Go code".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tools/da-debug/main.go` around lines 182 - 228, The current loop launches len(blobs) goroutines and allocates results with len(blobs), risking spikes; add a bounded semaphore (e.g., sem := make(chan struct{}, MaxConcurrency)) and acquire before starting each worker and release (defer) inside the goroutine to cap concurrent goroutines, and change results to a smaller fixed buffer (e.g., capacity MaxConcurrency or a sane constant) instead of len(blobs); keep the existing sync.WaitGroup usage and the close(results) waiter, and keep the decoding logic in the goroutine (tryDecodeHeader, tryDecodeData, targetHeight, blobResult, blobs) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tools/da-debug/main.go`:
- Line 87: The command-wide timeout in runSearch wraps the whole scan so
--range=0 is still limited; change to create a per-fetch context for each
Blob.GetAll call instead of a single context.WithTimeout around the loop: use a
root ctx (context.Background()) and for each daHeight call
context.WithTimeout(rootCtx, timeout) (and defer/call cancel immediately after
the GetAll returns), and treat timeout==0 as “no timeout” by using the root ctx
directly; update references in runSearch, the timeout handling, and the
Blob.GetAll calls so each fetch is independently cancellable and --range=0 truly
searches until found.
🧹 Nitpick comments (2)
tools/da-debug/main.go (2)
197-207: Type assertion via `any` is inconsistent with `queryHeight`.

The blob is passed as `any` and then type-asserted to `interface{ Data() []byte }`. In contrast, `queryHeight` (line 305) calls `blob.Data()` directly. This inconsistency could mask type mismatches at runtime. Consider using the concrete type consistently:

Alternative using direct method call

```diff
-	for _, blob := range blobs {
-		wg.Add(1)
-		go func(b any) {
-			defer wg.Done()
-
-			var blobData []byte
-			switch v := b.(type) {
-			case interface{ Data() []byte }:
-				blobData = v.Data()
-			default:
-				results <- blobResult{blobData: blobData, found: false}
-				return
-			}
+	for _, blob := range blobs {
+		wg.Add(1)
+		go func(blobData []byte) {
+			defer wg.Done()
```

And pass `blob.Data()` as the argument:

```diff
-		}(blob)
+		}(blob.Data())
```
Verify each finding against the current code and only fix it if needed. In `@tools/da-debug/main.go` around lines 197 - 207, The goroutine is accepting b as any and then type-asserting to interface{ Data() []byte }, which is inconsistent with queryHeight that directly calls blob.Data(); fix by making the goroutine accept the concrete type (either the actual blob type or the small interface interface{ Data() []byte }) instead of any and remove the type switch, or alternatively pass blob.Data() into the goroutine directly (so the goroutine param becomes []byte). Update the goroutine signature where wg.Done() and results <- blobResult are used, and ensure callers (including queryHeight) pass the same concrete blob or blob.Data() to keep types consistent with blobResult and avoid runtime assertions.
172-188: Consider logging non-"future" errors and documenting `--timeout 0` behavior.

Two observations:

1. Line 188: When `GetAll` fails with an error other than "future", the error is silently swallowed. For a debug tool, logging these errors would help users diagnose connection or authentication issues.
2. Lines 176-178: When `--timeout 0` is explicitly set, there is no timeout protection: calls to `GetAll` can hang indefinitely on network issues, since `BlobAPI.GetAll` relies entirely on the caller's context for cancellation.

Suggested improvement for error visibility

```diff
 		if err != nil {
 			if strings.Contains(err.Error(), "future") {
 				fmt.Printf("Reached future height at DA height %d\n", daHeight)
 				break
 			}
+			if verbose {
+				fmt.Printf("Warning: failed to fetch DA height %d: %v\n", daHeight, err)
+			}
 			continue
 		}
```
Verify each finding against the current code and only fix it if needed. In `@tools/da-debug/main.go` around lines 172 - 188, The call to client.Blob.GetAll (using fetchCtx/fetchCancel) currently swallows non-"future" errors and treats timeout==0 as an unbounded call; change it to (1) log any non-"future" error (include err, daHeight and ns in the message) instead of silently continuing so users can diagnose failures, and (2) make the --timeout 0 behavior explicit by either enforcing a sane default timeout (e.g., 30s) when timeout==0 or, if you keep an unbounded call, print a clear one-time warning that no timeout is set and GetAll may hang; update the code paths around fetchCtx/fetchCancel and the error handling after client.Blob.GetAll to implement these changes and surface useful context for debugging.
* main:
  * refactor(tools/da-debug): improve da debug (#3199)
  * fix(pkg/da): fix json rpc types (#3200)
  * chore: demote warn log as debug (#3198)
  * build(deps): Bump the all-go group across 1 directory with 2 updates (#3191)
  * refactor: display block source in sync log (#3193)
  * chore: remove cache analyzer (#3192)
  * feat: add TestStatePressure benchmark for state trie stress test (#3188)
Overview
Small improvements for the da-debug tool.
Summary by CodeRabbit
New Features
Improvements