blockifier: omit cairo-vm-specific VM frames from SierraGas revert traces#14251
blockifier: omit cairo-vm-specific VM frames from SierraGas revert traces#14251ron-starkware wants to merge 1 commit into
Conversation
PR SummaryHigh Risk Overview
Reviewed by Cursor Bugbot for commit 8569134. Bugbot is set up for automated code reviews on this repo. Configure here. |
|
Artifacts upload workflows: |
4ea7052 to
99bf4d0
Compare
839a939 to
da29821
Compare
Yoni-Starkware
left a comment
There was a problem hiding this comment.
@Yoni-Starkware made 2 comments.
Reviewable status: 0 of 23 files reviewed, 2 unresolved discussions (waiting on dorimedini-starkware, matanl-starkware, and ron-starkware).
crates/blockifier/src/execution/stack_trace.rs line 720 at r2 (raw file):
strip_vm_frames_in_sierra_gas, } => (*tracked_resource, *strip_vm_frames_in_sierra_gas, inner.as_ref()), other => (TrackedResource::CairoSteps, false, other),
It's safer to set omit_vm_frame to false inside the other arm rather than using dummy values for tracked resources etc
crates/blockifier/src/execution/stack_trace.rs line 743 at r2 (raw file):
error_stack.push(error_trace.clone().into()) } // Defensive: `annotated()` doesn't nest, but if anyone hand-constructs a nested
If that's the intended use, better to forbid this flow and panic IMO
…aces
For SierraGas-mode (Cairo 1) contracts the choice of execution backend
(cairo-vm CASM vs cairo-native) is meant to be a free implementation
detail, but blockifier's stack-trace formatter was committing the
backend choice into the rendered revert reason: cairo-vm produces a
`VmException` with a PC and a Cairo traceback, cairo-native does not,
and `stack_trace.rs` faithfully rendered the former and skipped the
latter. Because the revert reason is hashed verbatim into the receipt
commitment (and thus the block hash), the same Cairo 1 contract
reverting on the same input produced two different receipts depending
on which backend was hot in the contract class manager's cache at the
moment of execution. This is what bites echonet replays of mainnet: any
mainnet block where a Cairo 1 contract reverts can mismatch echonet's
receipt commitment iff echonet ran that class via native while mainnet
ran it via CASM (or vice versa).
This patch makes the revert reason backend-invariant for SierraGas
frames, gated behind a new `VersionedConstants` flag
`strip_vm_frames_in_sierra_gas` that activates at protocol v0.14.3.
Pre-v0.14.3 blocks continue to render the legacy format byte-for-byte
so historical receipts replay cleanly; v0.14.3+ blocks render the new
stripped format that's invariant under execution backend.
Mechanics:
* `execute_entry_point_call_wrapper` annotates every returned Err
with the executing contract's `TrackedResource` AND the protocol's
active `strip_vm_frames_in_sierra_gas` policy via a new
`EntryPointExecutionError::Annotated` variant. Annotation happens
at exactly one site so the wrapping policy is easy to audit.
* The formatter peels that annotation in
`extract_entry_point_execution_error_into_stack_trace` and only
emits the `Error at pc / Cairo traceback` block when either the
frame is `CairoSteps`-tracked (Cairo 0 — keep the PC/traceback,
they're the only useful signal) or the strip policy is off for
that block's protocol version (legacy compat).
* Cairo 0 traces are unchanged at every protocol version. Native
traces are unchanged. A SierraGas-mode contract at v0.14.3+ now
renders identically whether it ran via cairo-vm CASM or cairo-
native; at pre-v0.14.3 the legacy format is preserved.
The patch covers the full divergence surface. The `VmExceptionFrame`
block is the only backend-specific text in revert traces for
SierraGas-mode contracts: `SyscallHintProcessor::consume_step`
(`syscalls/hint_processor.rs:755`) is a no-op when the tracked resource
is `SierraGas`, so cairo-vm's `n_steps` cap cannot bite for V1 CASM
execution — only Sierra gas can exhaust, which both V1 CASM and V1Native
render as a Cairo-1 `Out of gas` panic through the existing
`ExecutionFailed { error_trace: Cairo1RevertSummary }` path. The
`Could not reach the end of the program. RunResources has no remaining
steps.` leaf message can only originate from a Cairo-0 (CairoSteps)
frame, which has no native counterpart and so cannot diverge.
Pattern-match sites that destructure `EntryPointExecutionError` get
new `unannotated()` / `into_unannotated()` helpers so they don't have
to know about the wrapping.
Locally verified with the `blockifier_reexecution --compare-native`
tool on mainnet block 6481044 (containing two reverted txs
`0x4d246b5c…` and `0x139e5fa8…` whose revert_error was the original
motivator for this patch): with the strip policy active, native and
CASM produce byte-identical revert reasons across all 26 txs in the
block; with the policy off, the legacy receipts are reproduced.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
da29821 to
8569134
Compare
ron-starkware
left a comment
There was a problem hiding this comment.
@ron-starkware made 2 comments and resolved 1 discussion.
Reviewable status: 0 of 23 files reviewed, 1 unresolved discussion (waiting on dorimedini-starkware, matanl-starkware, and Yoni-Starkware).
crates/blockifier/src/execution/stack_trace.rs line 720 at r2 (raw file):
Previously, Yoni-Starkware (Yoni) wrote…
It's safer to set omit_vm_frame to
falseinside theotherarm rather than using dummy values for tracked resources etc
Right. Is this better?
crates/blockifier/src/execution/stack_trace.rs line 743 at r2 (raw file):
Previously, Yoni-Starkware (Yoni) wrote…
If that's the intended use, better to forbid this flow and panic IMO
Done
dorimedini-starkware
left a comment
There was a problem hiding this comment.
@dorimedini-starkware reviewed 23 files and all commit messages, and made 13 comments.
Reviewable status: all files reviewed, 14 unresolved discussions (waiting on matanl-starkware, ron-starkware, and Yoni-Starkware).
crates/blockifier/src/execution/errors.rs line 103 at r3 (raw file):
tracked_resource: TrackedResource, strip_vm_frames_in_sierra_gas: bool, },
this may incur a lot of refactoring, but - I think this could be better.
otherwise the compiler won't stop you from "nesting" annotations. you later have an unreachable! call due to this - causing potential panics in the code that is only trying to construct user hints for why the tx is reverted, is not optimal..
WDYT?
Suggestion:
#[derive(Debug, Error)]
pub enum EntryPointExecutionErrorWithMetadata {
/// Tags an error with the executing contract's `TrackedResource` and the active
/// `strip_vm_frames_in_sierra_gas` policy, for the stack-trace formatter to consume.
/// Constructed only by `execute_entry_point_call_wrapper`.
#[error("{inner}")]
Annotated {
#[source]
inner: EntryPointExecutionError,
tracked_resource: TrackedResource,
strip_vm_frames_in_sierra_gas: bool,
},
#[error(transparent)]
UnAnnotated(EntryPointExecutionError)
}
#[derive(Debug, Error)]
pub enum EntryPointExecutionError {crates/blockifier/src/execution/stack_trace.rs line 745 at r3 (raw file):
EntryPointExecutionError::Annotated { .. } => { unreachable!("nested `Annotated` is forbidden; construct via `annotated()` only") }
see above - adding potential panic in code that really should be best-effort at best
Code quote:
EntryPointExecutionError::Annotated { .. } => {
unreachable!("nested `Annotated` is forbidden; construct via `annotated()` only")
}crates/blockifier/src/execution/stack_trace_test.rs line 1091 at r3 (raw file):
use cairo_vm::vm::errors::cairo_run_errors::CairoRunError; use cairo_vm::vm::errors::vm_errors::VirtualMachineError; use cairo_vm::vm::errors::vm_exception::VmException;
move to top of module
Code quote:
use cairo_vm::types::relocatable::Relocatable;
use cairo_vm::vm::errors::cairo_run_errors::CairoRunError;
use cairo_vm::vm::errors::vm_errors::VirtualMachineError;
use cairo_vm::vm::errors::vm_exception::VmException;crates/blockifier/src/execution/stack_trace_test.rs line 1109 at r3 (raw file):
/// the formatted revert string. fn render_revert_for( annotation: Option<(crate::execution::contract_class::TrackedResource, bool)>,
use this please
Code quote:
crate::execution::contract_class::TrackedResourcecrates/blockifier/src/execution/stack_trace_test.rs line 1112 at r3 (raw file):
) -> String { use crate::execution::stack_trace::gen_tx_execution_error_trace; use crate::transaction::errors::TransactionExecutionError;
move to top
Code quote:
use crate::execution::stack_trace::gen_tx_execution_error_trace;
use crate::transaction::errors::TransactionExecutionError;crates/blockifier/src/execution/stack_trace_test.rs line 1122 at r3 (raw file):
error: Box::new(top_error), class_hash: ClassHash(felt!("0xabc")), storage_address: ContractAddress::try_from(felt!("0x123")).unwrap(),
we have macros for these
Suggestion:
class_hash: class_hash!("0xabc"),
storage_address: contract_address!("0x123"),crates/blockifier/src/execution/stack_trace_test.rs line 1133 at r3 (raw file):
#[test] fn test_sierra_gas_frame_omits_vm_exception_block_when_strip_enabled() { use crate::execution::contract_class::TrackedResource;
move to top
Code quote:
use crate::execution::contract_class::TrackedResource;crates/blockifier/src/execution/stack_trace_test.rs line 1157 at r3 (raw file):
#[test] fn test_sierra_gas_frame_keeps_vm_exception_block_when_strip_disabled() { use crate::execution::contract_class::TrackedResource;
move to top
Code quote:
use crate::execution::contract_class::TrackedResource;crates/blockifier/src/execution/stack_trace_test.rs line 1176 at r3 (raw file):
#[test] fn test_cairo_steps_frame_keeps_vm_exception_block_under_either_policy() { use crate::execution::contract_class::TrackedResource;
move to top
Code quote:
use crate::execution::contract_class::TrackedResource;crates/blockifier/src/execution/stack_trace_test.rs line 1190 at r3 (raw file):
); } }
utilize auto-parallelization of tests with #[values]
Suggestion:
#[rstest]
fn test_cairo_steps_frame_keeps_vm_exception_block_under_either_policy(#[values(true, false)] strip: bool) {
use crate::execution::contract_class::TrackedResource;
let rendered = render_revert_for(Some((TrackedResource::CairoSteps, strip)));
assert!(
rendered.contains("Error at pc=0:42:"),
"CairoSteps-mode revert (strip={strip}) must include `Error at pc=` (got: \
{rendered:?})"
);
assert!(
rendered.contains("Cairo traceback (most recent call last):"),
"CairoSteps-mode revert (strip={strip}) must include `Cairo traceback` (got: \
{rendered:?})"
);
}crates/blockifier/src/execution/stack_trace_test.rs line 1211 at r3 (raw file):
use crate::execution::contract_class::TrackedResource; use crate::execution::stack_trace::gen_tx_execution_error_trace; use crate::transaction::errors::TransactionExecutionError;
move to top
Code quote:
use crate::execution::contract_class::TrackedResource;
use crate::execution::stack_trace::gen_tx_execution_error_trace;
use crate::transaction::errors::TransactionExecutionError;crates/blockifier/src/execution/stack_trace_test.rs line 1219 at r3 (raw file):
let double = inner .annotated(TrackedResource::SierraGas, true) .annotated(TrackedResource::CairoSteps, false);
see above - I would prefer the compiler stops you from doing this.
if this flow is accidentally implemented, who is to say the first annotation is the correct one?
Code quote:
let double = inner
.annotated(TrackedResource::SierraGas, true)
.annotated(TrackedResource::CairoSteps, false);crates/blockifier/src/execution/stack_trace_regression/test_contract_ctor_frame_stack_trace_cairo1_casm.txt line 4 at r3 (raw file):
0: Error in the called contract (contract address: 0x00000000000000000000000000000000000000000000000000000000c0020000, class hash: 0x0000000000000000000000000000000000000000000000000000000080020000, selector: 0x015d40a3d6ca2ac30f4031e42be28da9b056fef9bb7357ac5e85627ee876e5ad): Error at pc=0:443: 1: Error in the called contract (contract address: 0x00000000000000000000000000000000000000000000000000000000c0020000, class hash: 0x0000000000000000000000000000000000000000000000000000000080020000, selector: 0x02730079d734ee55315f4f141eaed376bddd8c2133523d223a344c5604e0f7f8):
I am confused as to what we want to happen:
- this is a CASM output. why are the first two frames stripped of PC locations?
- isn't there a
nativevariant of this regression txt in which I can see the effect of a native run? if not, please add a parametrized test that shows the different outcome of identical "full flows", where one starts in native and one starts in VM
Code quote:
Transaction execution has failed:
0: Error in the called contract (contract address: 0x00000000000000000000000000000000000000000000000000000000c0020000, class hash: 0x0000000000000000000000000000000000000000000000000000000080020000, selector: 0x015d40a3d6ca2ac30f4031e42be28da9b056fef9bb7357ac5e85627ee876e5ad):
Error at pc=0:443:
1: Error in the called contract (contract address: 0x00000000000000000000000000000000000000000000000000000000c0020000, class hash: 0x0000000000000000000000000000000000000000000000000000000080020000, selector: 0x02730079d734ee55315f4f141eaed376bddd8c2133523d223a344c5604e0f7f8):
Error at pc=0:797:
2: Error in the contract class constructor (contract address: 0x0103ee82605273496eed8d9141c5b3ad967baa08be63aa5bc49ffae5eae454cc, class hash: 0x0000000000000000000000000000000000000000000000000000000080040000, selector: 0x028ffe4ff0f226a9107253e17a904099aa4f63a02a5621de0576e5aa71bc5194):
dorimedini-starkware
left a comment
There was a problem hiding this comment.
@dorimedini-starkware made 1 comment.
Reviewable status: all files reviewed, 14 unresolved discussions (waiting on matanl-starkware, ron-starkware, and Yoni-Starkware).
crates/blockifier/src/execution/errors.rs line 103 at r3 (raw file):
Previously, dorimedini-starkware wrote…
this may incur a lot of refactoring, but - I think this could be better.
otherwise the compiler won't stop you from "nesting" annotations. you later have anunreachable!call due to this - causing potential panics in the code that is only trying to construct user hints for why the tx is reverted, is not optimal..
WDYT?
to implement this, I would start with a PR that defines EntryPointExecutionErrorWithMetadata with only the UnAnnotated variant, and replace usage of EntryPointExecutionError with this new enum; the next PR can add the Annotated variant and implement the logic you have in this current PR
For SierraGas-mode (Cairo 1) contracts the choice of execution backend
(cairo-vm CASM vs cairo-native) is meant to be a free implementation
detail, but blockifier's stack-trace formatter was committing the
backend choice into the rendered revert reason: cairo-vm produces a
VmExceptionwith a PC and a Cairo traceback, cairo-native does not,and
stack_trace.rsfaithfully rendered the former and skipped thelatter. Because the revert reason is hashed verbatim into the receipt
commitment (and thus the block hash), the same Cairo 1 contract
reverting on the same input produced two different receipts depending
on which backend was hot in the contract class manager's cache at the
moment of execution. This is what bites echonet replays of mainnet: any
mainnet block where a Cairo 1 contract reverts can mismatch echonet's
receipt commitment iff echonet ran that class via native while mainnet
ran it via CASM (or vice versa).
This patch makes the revert reason backend-invariant for SierraGas
frames, gated behind a new
VersionedConstantsflagstrip_vm_frames_in_sierra_gasthat activates at protocol v0.14.3.Pre-v0.14.3 blocks continue to render the legacy format byte-for-byte
so historical receipts replay cleanly; v0.14.3+ blocks render the new
stripped format that's invariant under execution backend.
Mechanics:
execute_entry_point_call_wrapperannotates every returned Errwith the executing contract's
TrackedResourceAND the protocol'sactive
strip_vm_frames_in_sierra_gaspolicy via a newEntryPointExecutionError::Annotatedvariant. Annotation happensat exactly one site so the wrapping policy is easy to audit.
extract_entry_point_execution_error_into_stack_traceand onlyemits the
Error at pc / Cairo tracebackblock when either theframe is
CairoSteps-tracked (Cairo 0 — keep the PC/traceback,they're the only useful signal) or the strip policy is off for
that block's protocol version (legacy compat).
traces are unchanged. A SierraGas-mode contract at v0.14.3+ now
renders identically whether it ran via cairo-vm CASM or cairo-
native; at pre-v0.14.3 the legacy format is preserved.
The patch covers the full divergence surface. The
VmExceptionFrameblock is the only backend-specific text in revert traces for
SierraGas-mode contracts:
SyscallHintProcessor::consume_step(
syscalls/hint_processor.rs:755) is a no-op when the tracked resourceis
SierraGas, so cairo-vm'sn_stepscap cannot bite for V1 CASMexecution — only Sierra gas can exhaust, which both V1 CASM and V1Native
render as a Cairo-1
Out of gaspanic through the existingExecutionFailed { error_trace: Cairo1RevertSummary }path. TheCould not reach the end of the program. RunResources has no remaining steps.leaf message can only originate from a Cairo-0 (CairoSteps)frame, which has no native counterpart and so cannot diverge.
Pattern-match sites that destructure
EntryPointExecutionErrorgetnew
unannotated()/into_unannotated()helpers so they don't haveto know about the wrapping.
Locally verified with the
blockifier_reexecution --compare-nativetool on mainnet block 6481044 (containing two reverted txs
0x4d246b5c…and0x139e5fa8…whose revert_error was the originalmotivator for this patch): with the strip policy active, native and
CASM produce byte-identical revert reasons across all 26 txs in the
block; with the policy off, the legacy receipts are reproduced.