idma_error_handler: gate WAIT_LAST_W transitions on eh_valid_i#97
Conversation
The error-handler FSM has two structurally identical wait states. In
WAIT, the decision logic is correctly gated on eh_valid_i:
WAIT : begin
if (eh_valid_i) begin
if (eh_i == idma_pkg::CONTINUE) ...
if (eh_i == idma_pkg::ABORT) ...
end
end
WAIT_LAST_W is missing this guard:
WAIT_LAST_W : begin
if (eh_i == idma_pkg::CONTINUE) ...
if (eh_i == idma_pkg::ABORT) ...
end
idma_eh_req_t is a 1-bit signal whose enum (idma_pkg.sv:27) is
{CONTINUE = 0, ABORT = 1}. With eh_valid_i deasserted (the typical
idle bus condition), eh_i defaults to 0 = CONTINUE, so the FSM
auto-CONTINUEs the cycle it enters WAIT_LAST_W and never waits for
the user's actual answer. The user has no way to ABORT a transfer
after a last-burst write error: by the time they decode the error,
the FSM has already taken the CONTINUE branch.
Reproducer (Verilator 5.046, MetaFifoDepth=4): release reset, drive
one write-error response with w_last_burst_i=1 (which IDLE catches
and routes to WAIT_LAST_W), keep eh_valid_i=0, observe the FSM.
Without the fix the FSM goes IDLE -> WAIT_LAST_W -> EMIT_EXTRA_RSP ->
IDLE in three cycles regardless of the user; with the fix it stays
in WAIT_LAST_W indefinitely until eh_valid_i is asserted.
Wrap the WAIT_LAST_W body in `if (eh_valid_i)`, mirroring WAIT.
There was a problem hiding this comment.
Pull request overview
This PR fixes the idma_error_handler FSM so that in WAIT_LAST_W it only consumes eh_i (and transitions) when eh_valid_i is asserted, preventing the idle value of a 1-bit eh_i from being misinterpreted as CONTINUE and prematurely advancing the state machine after a last-burst write error.
Changes:
- Gate
WAIT_LAST_Wdecision/transition logic behindeh_valid_i, mirroring the existingWAITstate behavior. - Ensure the frontend has an opportunity to provide an explicit
ABORTdecision before the FSM proceeds toEMIT_EXTRA_RSP/LEG_FLUSH.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Hi @flaviens, same accidental merge story as #96. For this one, PR #93 just landed on devel and addresses the same eh_valid gating issue, plus a couple of related guards in the surrounding states. That PR is older by a few months and has broader scope, so we kept it as the canonical fix. You found the same bug independently though, thanks for the catch. If there's something in your version that #93 misses, let us know and we'll add it as a follow-up. |
|
Sounds good! |
Hi there!
WAIT_LAST_W is missing the eh_valid_i guard.
Because idma_eh_req_t is 1 bit (CONTINUE = 0, ABORT = 1), the idle value of eh_i is interpreted as CONTINUE when eh_valid_i is low. As a result, after a last-burst write error, the FSM immediately takes the CONTINUE path on entry to WAIT_LAST_W, so the user never gets a chance to choose ABORT.
Reproducer: with MetaFifoDepth=4, drive a write-error response with w_last_burst_i=1 and keep eh_valid_i=0. Before the fix, the FSM transitions:
IDLE -> WAIT_LAST_W -> EMIT_EXTRA_RSP -> IDLE
With the fix, it remains in WAIT_LAST_W until eh_valid_i is asserted.
Thanks!
Flavien