Document the Python front-end by tautschnig · Pull Request #1136 · strata-org/Strata

tautschnig · 2026-05-07T10:23:25Z

Adds two complementary documents:

docs/PythonFrontend.md — a narrative, human-oriented reference describing how Python source is turned into a Core program. Covers the .py → Laurel → Core → SMT/GOTO pipeline, the Any-boxed value encoding (with the datatype definition), the PySpec mechanism for external libraries such as boto3, the supported Python subset and the constructs that are rejected or collapsed, and the soundness posture (tag checks are the contract, PySpec is trusted, aliasing is a known unsoundness). It includes a 'where to look in the code' table linking to the main Lean files involved at each stage.
.kiro/steering/python-frontend-debugging.md — a compact playbook for agents debugging pyAnalyzeLaurel / pyAnalyzeLaurelToGoto failures. Scoped via fileMatch so it activates only when editing Strata/Languages/Python/, Strata/Languages/Laurel/, or the corresponding tests. Lists: a pipeline map correlating error prefixes with the stage that emits them; the seven invariants of the Any-boxed encoding; catalogued failure classes ('Resolution failed', 'could not infer type', 'calls to procedures in functions or contracts', 'holes should have been eliminated', 'N passed, M inconclusive', false alarms on clean benchmarks, GOTO-side issues) with where-to-look pointers; a pre-edit checklist; and the architectural limits coming from Strata's type system that should not be worked around locally.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Copilot

Pull request overview

Adds documentation for Strata’s Python front-end: an end-to-end narrative of the .py → Laurel → Core → SMT/GOTO pipeline plus an agent-facing debugging playbook scoped to the relevant Lean code areas.

Changes:

Added docs/PythonFrontend.md describing the pipeline, the Any boxing encoding, PySpec integration, supported/rejected Python constructs, and soundness assumptions.
Added .kiro/steering/python-frontend-debugging.md as a troubleshooting map for common pyAnalyzeLaurel / pyAnalyzeLaurelToGoto failures, with stage pointers and invariants.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File	Description
docs/PythonFrontend.md	New narrative reference for how Python is translated and verified/compiled within Strata.
.kiro/steering/python-frontend-debugging.md	New scoped debugging playbook for investigating Python front-end pipeline failures.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Adds two complementary documents: * docs/PythonFrontend.md — a narrative, human-oriented reference describing how Python source is turned into a Core program. Covers the .py → Laurel → Core → SMT/GOTO pipeline, the Any-boxed value encoding (with the datatype definition), the PySpec mechanism for external libraries such as boto3, the supported Python subset and the constructs that are rejected or collapsed, and the soundness posture (tag checks are the contract, PySpec is trusted, aliasing is a known unsoundness). It includes a 'where to look in the code' table linking to the main Lean files involved at each stage. * .kiro/steering/python-frontend-debugging.md — a compact playbook for agents debugging pyAnalyzeLaurel / pyAnalyzeLaurelToGoto failures. Scoped via fileMatch so it activates only when editing Strata/Languages/Python/**, Strata/Languages/Laurel/**, or the corresponding tests. Lists: a pipeline map correlating error prefixes with the stage that emits them; the seven invariants of the Any-boxed encoding; catalogued failure classes ('Resolution failed', 'could not infer type', 'calls to procedures in functions or contracts', 'holes should have been eliminated', 'N passed, M inconclusive', false alarms on clean benchmarks, GOTO-side issues) with where-to-look pointers; a pre-edit checklist; and the architectural limits coming from Strata's type system that should not be worked around locally. Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>

remi-delmas-3000 · 2026-05-07T19:35:16Z

+
+## Overview
+
+The Python front-end translates annotated Python programs into Strata's Core


programs annotated with what kind of information ?

remi-delmas-3000 · 2026-05-07T19:36:38Z

+verifier (`pyAnalyzeLaurel`) or compiled through to CBMC GOTO programs
+(`pyAnalyzeLaurelToGoto`).
+
+The front-end is intentionally scoped. It does not attempt to model arbitrary


suggestion: the front-end is scoped down to a fragment of python

remi-delmas-3000 · 2026-05-07T19:39:45Z

+procedural code that invokes library APIs described by PySpec, raises a
+handful of exception types, and avoids the more dynamic corners of the
+language. Within that subset the aim is *sound* analysis — if the tool
+reports "no bugs" then, modulo known specification gaps, there really are


modulo known specification gaps
this introduces doubt, are these gaps in the semantics of the tool accepts as input ? If we know the gaps then why not scope down the fragment even further to reject these constructs and give unconditionally sound results ?

remi-delmas-3000 · 2026-05-07T19:42:09Z

+1. The pipeline from a Python `.py` file to a Core program.
+2. The `Any`-boxed value encoding that underpins everything the front-end
+   produces.
+3. The PySpec mechanism that describes external library functions.


So the pyspec .ion files are contracts for all external dependencies of the .py file on this diagram ?

why this asymetry between .py and .ion files ? why not use regular python signatures with contract annotations to specify external dependencies ?

remi-delmas-3000 · 2026-05-07T19:46:26Z

+
+* A Python `int` literal `42` becomes `from_int(42)`.
+* `x + y` on Python `int`s becomes
+  `from_int(Any..as_int!(x) + Any..as_int!(y))`, bracketed by tag checks.


is any of that boxing/unboxing statically resolved before or during the translation to SMT or GOTO or do we generate quantifier heavy SMT encodings and let the SMT solver resolve all boxing and unboxing ?

remi-delmas-3000 · 2026-05-07T20:45:16Z

+
+Mutable containers (`list`, `dict`) are encoded as *value-semantics*
+inductive datatypes (`ListAny`, `DictStrAny`), not as heap references. This
+means the model does not capture aliasing: after `c = a; a.append(4)`, the


is this a hard limitation or do we have a path to soundness on this point ?

remi-delmas-3000 · 2026-05-07T20:45:38Z

+* **Tag checks are the contract.** Every place a Core expression consumes a
+  specific constructor (e.g. `as_int`) must be preceded by a Strata assertion
+  that the corresponding `isfrom_*` holds. The translator is responsible for
+  inserting these; a missing one is a soundness bug.


what's our path to soundness here ? lean proofs about how the translator inserts tag checking assertions ?

remi-delmas-3000 · 2026-05-07T20:47:30Z

+  postcondition, the verifier trusts it. Specs generated from upstream API
+  descriptions inherit the same trust.
+* **Holes are sound by construction.** Each hole becomes a fresh
+  uninterpreted procedure; the verifier treats its outputs as arbitrary


uninterpreted as in x = y => f(x) = f(y) ? is this really sound ? what if the external dep is stateful ?

remi-delmas-3000 · 2026-05-07T20:47:54Z

+  uninterpreted procedure; the verifier treats its outputs as arbitrary
+  values consistent with their type. This is sound but often imprecise — a
+  typical "inconclusive" result traces back to a hole.
+* **Aliasing is unsound** as noted above.


what's the path to soundness there ?

remi-delmas-3000 · 2026-05-07T20:48:33Z

+  values consistent with their type. This is sound but often imprecise — a
+  typical "inconclusive" result traces back to a hole.
+* **Aliasing is unsound** as noted above.
+* **Everything else follows from Strata's Core type system**, whose


can we elaborate on what the "everything else" means here from a python semantics perspective ? does it mean that once we've reached strata core nothing else can go wrong ? i.e. any unexpected results is necessarily a bug in the upper layers ?

…, strata-org#1144) Add section explaining why existing documentation efforts (PRs strata-org#1136 and strata-org#1144) are complementary but insufficient: they document WHAT the pipeline does but don't specify WHEN coercions fire, WHAT constitutes valid intermediate output, or HOW to arbitrate design disagreements. Positions both efforts as valuable (document as-is) while distinguishing the architecture spec (document should-be, precise enough for mechanical implementation). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Reframe as "complementary, different purpose" rather than "insufficient." Position PRs strata-org#1136/strata-org#1144 as valuable (onboarding, debugging) while distinguishing the architecture spec's prescriptive role. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

tautschnig self-assigned this May 7, 2026

tautschnig requested review from a team and Copilot May 7, 2026 10:23

github-actions Bot added the Waiting-For-Review label May 7, 2026

Copilot started reviewing on behalf of tautschnig May 7, 2026 10:24 View session

Copilot AI reviewed May 7, 2026

View reviewed changes

Comment thread docs/PythonFrontend.md Outdated

Comment thread docs/PythonFrontend.md Outdated

Comment thread docs/PythonFrontend.md Outdated

Comment thread .kiro/steering/python-frontend-debugging.md Outdated

tautschnig force-pushed the tautschnig/python-front-end-doc branch from 3c74dae to d6f474a Compare May 7, 2026 12:04

remi-delmas-3000 reviewed May 7, 2026

View reviewed changes

tautschnig changed the base branch from main to main2 May 15, 2026 21:04


		## Overview

		The Python front-end translates annotated Python programs into Strata's Core

Conversation

tautschnig commented May 7, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants