pyAnalyzeLaurel: classify UserError diagnostics as known limitations#1118
pyAnalyzeLaurel: classify UserError diagnostics as known limitations#1118julesmt wants to merge 2 commits into
Conversation
When Laurel-to-Core translation fails, pyAnalyzeLaurel previously reported every failure as 'RESULT: Internal error' (exit 3), including legitimate user-facing limitations such as 'calls to procedures are not supported in functions or contracts'. These are UserError-level diagnostics; reporting them as internal errors conflates genuine compiler bugs with 'not yet supported' constructs. Classify by diagnostic type: only StrataBug triggers 'Internal error'. UserError and NotYetImplemented trigger 'Known limitation' (exit 4). If the batch contains both a StrataBug and a UserError, the Strata bug takes priority so real compiler issues are not hidden.
tautschnig
left a comment
There was a problem hiding this comment.
The direction is right — lumping user-facing limitations under Internal error / exit 3 was genuinely confusing, and downgrading NotYetImplemented to Known limitation / exit 4 is an improvement. No prior review threads or issue comments on this PR. The change is small, self-contained, and doesn't touch any other paths. My concerns are all about the classification design and test coverage; I'll call them out below.
1. UserError is collapsed into Known limitation, but the project's exit-code semantics say otherwise. The header comment at StrataMain.lean:55–60 is explicit:
Codes 1–2 are user-actionable (fix the input or the code under analysis).
Codes 3–4 are tool-side (report as a bug or wait for support).
So by the module's own rule:
| Diagnostic type | Intended exit class | This PR |
|---|---|---|
StrataBug |
tool-side / internalError (3) |
✅ internalError |
NotYetImplemented |
tool-side / knownLimitation (4) |
✅ knownLimitation |
UserError |
user-actionable / userError (1) |
knownLimitation (4) |
Warning |
n/a (shouldn't cause failure) | knownLimitation (4) |
The PR's framing argues that existing diagnostics labelled UserError are often "not supported" messages (e.g. PrimitiveOp {op} with {args.length} args is not supported at LaurelToCoreTranslator.lean:224, which is DiagnosticType.UserError). That's true — the diagnostic labelling in the translator is fuzzy. But the right place to fix that fuzziness is the diagnostic labels, not the classifier here. Otherwise a genuine user error (like "calls to procedures are not supported in functions or contracts" that the PR body cites) will land with exit 4 ("tool-side, wait for support") when it should land with exit 1 ("fix your code").
Concrete suggestion: priority-order the severities, matching the Python → Laurel path's tri-fold classification a few lines above (637–640):
let severity :=
if laurelTranslateErrors.any (·.type == .StrataBug) then .StrataBug
else if laurelTranslateErrors.any (·.type == .UserError) then .UserError
else if laurelTranslateErrors.any (·.type == .NotYetImplemented) then .NotYetImplemented
else .Warning -- only reachable if translation returned `none` with warnings-only, which is itself a bug
match severity with
| .StrataBug => exitPyAnalyzeInternalError msg
| .UserError => exitPyAnalyzeUserError msg
| .NotYetImplemented => exitPyAnalyzeKnownLimitation msg
| .Warning => exitPyAnalyzeInternalError msg -- failure with no real diagnosticTakes the same worst-wins shape as the current hasStrataBug check but extends it to the full enum. Addresses the conflation without changing the spirit of the PR.
If the diagnostic labels in the translator really are inaccurate (e.g. .UserError used where .NotYetImplemented would be more honest), that's a separate, prior bug; fixing the labels at the call sites is the right scope. Happy to suggest a follow-up PR that relabels the specific "not supported yet" sites.
2. No tests. This is a behavioural change to exit codes and the RESULT: line — exactly the class of change that shell-based regression tests catch. The repo has precedent for these: StrataTest/Languages/Python/expected_laurel/*.expected snapshots pin down the DETAIL:/RESULT: lines for cases that hit exitPyAnalyzeUserError, and adding one more snapshot file plus a tiny Python fixture that reproducibly triggers DiagnosticType.UserError in Laurel-to-Core would nail down the new behaviour. Two more fixtures:
- A
StrataBugfixture (e.g. a Laurel program that triggers.prim .realtranslation atLaurelToCoreTranslator.lean:107which emitsDiagnosticType.StrataBug) — expectRESULT: Internal error, exit 3. This is the "regression guard": if the classifier ever flips to lump StrataBug into limitations, this test catches it. - A
NotYetImplementedfixture — expectRESULT: Known limitation, exit 4. - A mixed fixture (same run emits both
StrataBugandUserError) — expectRESULT: Internal error, exit 3 (StrataBug wins). This is the only test that actually exercises thehasStrataBugpriority logic.
Without those three, a future "simplify" refactor could silently flip the classification.
3. Edge cases currently unhandled.
- Empty
laurelTranslateErrorswithcoreProgramOption = none. This shouldn't happen (how does translation fail with no diagnostics?) — but if it does,any (· == .StrataBug)isfalse, so the PR classifies it asKnown limitation. That's the wrong classification for a "shouldn't happen" situation; it should beInternal errorsince anone-without-diagnostics is itself a Strata bug. Trivial to add:if laurelTranslateErrors.isEmpty || hasStrataBug then internal else limitation. - Warnings-only failure. Same concern: a list of only
.Warningdiagnostics withcoreProgramOption = noneimplies translation failed silently.
The severity-match suggestion above handles both.
4. Message formatting. s!"Laurel to Core translation failed: {laurelTranslateErrors}" dumps the full List DiagnosticModel via ToString. In an Internal error context that's fine (we want raw diagnostics for debugging). In a Known limitation context, a user-facing DETAIL: line should probably render the diagnostics through DiagnosticModel.format against the known fileMap, the way Python → Laurel user errors at line 637 do. Not blocking; worth a follow-up.
5. Refactoring. The Python → Laurel block at 637–640 and this new block both classify errors; the logic isn't quite the same shape (one matches a PipelineError sum, the other walks a diagnostic list), but a shared helper classifyAndExit : List DiagnosticModel → String → IO α would let both sites (and any future third site) stay in lockstep. Again, follow-up-able.
Co-authored-by: Michael Tautschnig <mt@debian.org>
tautschnig
left a comment
There was a problem hiding this comment.
Still outstanding from my earlier review body:
Tests for the three classifier arms. The PR remains test-free.
- StrataBug fixture — expect
RESULT: Internal error, exit 3. Regression guard against anyone re-flattening the classifier back to "everything is a limitation". - UserError-only fixture — expect
RESULT: User error, exit 1. This arm was the whole point of the PR; a silent regression that sends UserError back toknownLimitationwould otherwise be invisible. - NotYetImplemented-only fixture — expect
RESULT: Known limitation, exit 4. - Mixed StrataBug + UserError fixture — expect
RESULT: Internal error, exit 3 (StrataBug wins). This is the only test that exercises the priority ordering the new classifier encodes. - Empty-diagnostics / warnings-only fixtures — expect
RESULT: Internal error, exit 3. Pins the defensive arms so they can't silently flip back toknownLimitation.
The repository's expected_laurel/*.expected snapshot infrastructure handles exactly this shape of test; four to five small Python fixtures + matching .expected files would lock the classifier down.
Desired improvement 1: Message formatting at line 660: s!"Laurel to Core translation failed: {laurelTranslateErrors}" still dumps the raw List DiagnosticModel via ToString. In the internalError arm that's fine for debugging; in userError / knownLimitation arms a user-facing DETAIL: line should render through DiagnosticModel.format against the fileMap the way exitPyAnalyzeUserError does at line 637. This is a presentation nit; worth a small follow-up.
Desired improvement 2: Shared classifier helper. The Python→Laurel block (637–640) and this Laurel→Core block (661–675) both classify-and-exit over a diagnostic stream; the two now have very similar shapes (three-way split + defensive fallthrough). A shared classifyAndExit : List DiagnosticModel → String → IO α helper would keep them in lockstep and remove the slight shape mismatch.
When Laurel-to-Core translation fails, pyAnalyzeLaurel previously reported every failure as 'RESULT: Internal error' (exit 3), including legitimate user-facing limitations such as 'calls to procedures are not supported in functions or contracts'. These are UserError-level diagnostics; reporting them as internal errors conflates genuine compiler bugs with 'not yet supported' constructs.
Classify by diagnostic type: only StrataBug triggers 'Internal error'. UserError and NotYetImplemented trigger 'Known limitation' (exit 4). If the batch contains both a StrataBug and a UserError, the Strata bug takes priority so real compiler issues are not hidden.
Issue #, if available:
Description of changes:
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.