Core.formatProgram to produce round-trip-parseable output for all constructs#1165
Core.formatProgram to produce round-trip-parseable output for all constructs#1165MikaelMayer wants to merge 10 commits into
Conversation
- Fix inline function formatting: add trailing space to 'inline' op in Grammar.lean to prevent 'inlinefunction' concatenation - Emit inline attribute when formatting functions that have it - Use prettyName for quantifier variables instead of generated __qN names - Add overflow predicate support: grammar entries, formatter handling, parser entries, and Factory operations for Bv32 overflow predicates - Add roundtrip test infrastructure for Core programs - Update test expected outputs to reflect prettyName fix
Resolve conflicts between the formatProgram roundtrip fix and main's refactoring (FormatCore extraction, recFnDeclToCST, procedure syntax). - Apply prettyName and inline? fixes to FormatCore.lean - Use proper overflow CST constructors instead of approximations - Remove duplicate overflow declarations in Factory.lean - Update test expected outputs for combined changes - Fix RoundtripTest to handle new 'program Core;' header
MikaelMayer
left a comment
There was a problem hiding this comment.
🔍 Good PR — the prettyName preservation, inline function formatting, and overflow predicate grammar entries are well-designed. The roundtrip test infrastructure is a valuable addition.
One substantive issue: the overflow ops in Translate.lean ignore the type parameter and hardcode bv32, which breaks roundtrip for non-32-bit overflow predicates.
Minor: the docstring on mkQuantVarName (line 86 of FormatCore.lean) says "In the future, we will store existing variable names in an extra field" — but this PR implements exactly that. Worth updating to reflect the current state.
… grammar entry - Overflow ops in translateFn now match on type parameter (.some .bv1, .some .bv8, etc.) instead of using wildcard _, consistent with all other type-parameterized BV ops. - Added bv_sdiv_overflow grammar entry so SDivOverflow formats as Bv.SDivOverflow(x, y) instead of the incorrect x <= y. - Added overflow ops to the polymorphic binary dispatch in translateExpr. - Updated editor syntax files with Bv.SDivOverflow.
|
Do all the follow-ups and update the PR description when done |
… roundtrip - Implement lhsArray (m[k] := v) in Translate.lean by decomposing nested LHS into base identifier + indices and building map_update expressions - Add decomposeMapUpdate in FormatCore.lean to detect map_update patterns and format them back as lhsArray syntax for proper roundtrip - Add seq_empty grammar entry with explicit type annotation syntax (Sequence.empty<T>()) and corresponding translate/format handling - Add full roundtrip test for datatypes (parentheses issue is resolved) - Add roundtrip tests for lhsArray and Sequence.empty - Regenerate editor syntax files with Sequence.empty - Update AdvancedMaps expected output to match new lhsArray formatting
|
🤖 Done in 4683739. Addressed the follow-ups:
All 598 test jobs pass. |
…tprogram-does-not-produce-roun
MikaelMayer
left a comment
There was a problem hiding this comment.
🔍 Well-structured PR that addresses multiple formatting roundtrip issues. The roundtrip test infrastructure is a good addition. The quantifier pretty-name preservation, inline attribute emission, overflow predicate grammar entries, and lhsArray detection are all solid improvements.
A few items to address:
- Dead code added in Factory.lean (
bv32NegOverflowFunc/bv32NegOverflowOp) — the macro already generates these. UNegOverflowis conflated withSNegOverflowin the formatter/parser roundtrip (they use different semantics:BitVec.negOverflowvsBitVec.unegOverflow).decomposeMapUpdatedocstring says "nested" but only handles single-level.
| def bv64Extract_15_0_Op := bv64Extract_15_0_Func.opExpr | ||
| def bv64Extract_7_0_Op := bv64Extract_7_0_Func.opExpr | ||
|
|
||
| def bv32NegOverflowFunc : WFLFunc CoreLParams := |
There was a problem hiding this comment.
🔍 bv32NegOverflowFunc and bv32NegOverflowOp are dead code — they are never referenced anywhere. The macro DefBVOpFuncExprs [1, 8, 16, 32, 64] already generates bv32SNegOverflowOp (from the "SNegOverflow" spec in BVOpSpecs). These manual definitions should be removed.
There was a problem hiding this comment.
🤖 Removed bv32NegOverflowFunc and bv32NegOverflowOp in c3b37cd.
| -- Overflow predicates: approximated as Bool.Not for CST printing | ||
| | .bv ⟨_, .SNegOverflow⟩ | .bv ⟨_, .UNegOverflow⟩ => pure (.not default arg) | ||
| -- Overflow predicates | ||
| | .bv ⟨w, .SNegOverflow⟩ | .bv ⟨w, .UNegOverflow⟩ => |
There was a problem hiding this comment.
🔍 SNegOverflow and UNegOverflow are mapped to the same grammar entry (bv_neg_overflow), but the parser always maps bv_neg_overflow back to Core.bv{N}SNegOverflowOp. Since these use different evaluation functions (BitVec.negOverflow vs BitVec.unegOverflow), a UNegOverflow expression loses its semantics after a roundtrip. Consider adding a separate bv_uneg_overflow grammar entry (e.g., "Bv.UNegOverflow") to preserve the distinction.
There was a problem hiding this comment.
🤖 Added separate bv_uneg_overflow grammar entry (Bv.UNegOverflow) and updated FormatCore to use it for UNegOverflow, while bv_neg_overflow now formats as Bv.SNegOverflow. Parser mappings added in Translate.lean. c3b37cd
| modify (·.pushBoundVar name.val) | ||
| pure (.funcDecl_statement default name typeArgs b r preconds bodyExpr inline?) | ||
|
|
||
| /-- Decompose nested `map_update(base, idx, val)` where `base` is (or starts |
There was a problem hiding this comment.
🔍 The docstring says "Decompose nested map_update" but the implementation only handles single-level (base must be an fvar). For nested m[k1][k2] := v, this returns some ([k1], update(select(m, k1), k2, v)) which formats as m[k1] := (m[k1])[k2:=v] rather than m[k1][k2] := v. Consider updating the docstring to say "single-level" to match the actual behavior, or implementing the recursive case by checking if val is itself a map_update whose base is a select on the same variable.
There was a problem hiding this comment.
🤖 Updated docstring to say "single-level" in c3b37cd.
…de, fix docstring - Add separate bv_uneg_overflow grammar entry so UNegOverflow roundtrips correctly (previously both mapped to bv_neg_overflow losing semantics) - Remove dead bv32NegOverflowFunc/bv32NegOverflowOp definitions - Fix decomposeMapUpdate docstring to say single-level instead of nested - Regenerate editor syntax files
Fixes #1158
Summary
Fixes several issues with
Core.formatProgramthat prevented round-trip parsing:inline functionformatted without space — Theinlinegrammar op now includes a trailing space, preventinginlinefunctionconcatenation. The formatter also now emits theinlineattribute when present on functions.Quantifier variable names — The formatter now uses the
prettyNamefield fromLExpr.quantinstead of generating__qNnames, preserving the original variable names through formatting.Overflow predicates — Added grammar entries, formatter handling, parser entries, and Factory operations for all bitvector overflow predicates (
SNegOverflow,UNegOverflow,SAddOverflow,SSubOverflow,SMulOverflow,SDivOverflow,UAddOverflow,USubOverflow,UMulOverflow).SNegOverflowandUNegOverflowuse distinct grammar entries (Bv.SNegOverflow/Bv.UNegOverflow) to preserve their different semantics through roundtrip. The translate direction correctly dispatches overflow ops by type parameter (bv1/bv8/bv16/bv32/bv64).Array assignment (
lhsArray) — Implementedm[k] := vtranslation in both directions: the parser decomposes nested LHS into base identifier + indices and buildsmap_updateexpressions; the formatter detects themap_update(var, idx, val)pattern and produceslhsArraysyntax.Sequence.empty— Added grammar entry with explicit type annotation syntax (Sequence.empty<int>()), plus translate and format handling, resolving the 0-ary polymorphic function limitation for this operation.Datatype roundtrip — Verified that datatype formatting roundtrips correctly (the extra-parentheses issue noted earlier is resolved).
Roundtrip test infrastructure — Added
RoundtripTest.leanthat verifies parse → format → re-parse → re-format produces stable output for types, functions, procedures, inline functions, parameterized type arguments, datatypes, array assignments, andSequence.empty.Testing
Remaining limitations