ADR-221: Implement Sample class hierarchy and ManifestStore with JSON round-trip serialisation#203
Conversation
jodavis-claude
left a comment
There was a problem hiding this comment.
Priority 1 — Correctness
-
ml/pipeline/core/manifest.py line 67:
ManifestStore.write()raisesIndexErroron an empty manifest.sample_type = _SAMPLE_TYPE_KEY[type(samples[0])]throws whenmanifest.samplesis empty.Manifest([])is a valid object (seetest_empty_manifest), sowrite()must either support it or raise a descriptiveValueError("Cannot write empty manifest")instead of an opaque index error. -
ml/pipeline/core/manifest.py line 61:
ManifestStore.read()ignores the"version"field entirely. The JSON schema explicitly carries"version": 1. Without a version check, a file written by a future incompatible schema is silently deserialized as version 1 and produces corrupt objects with no diagnostic. Addif data.get("version") != 1: raise ValueError(f"Unsupported manifest version: {data.get('version')!r}").
Priority 4 — Test coverage
- ml/test/pipeline/core/test_manifest.py: No test exercises
ManifestStore().write(Manifest([]), path). The existingtest_empty_manifestcreatesManifest([])and calls lookup methods but never callswrite(), so theIndexErrorfrom issue 1 is not caught by the test suite. Add a test underTestManifestLookup(or a dedicated class) that confirms the behaviour of writing an empty manifest.
Style (non-blocking)
- ml/test/pipeline/core/test_manifest.py line 4:
import tempfileis unused — tests use thetmp_pathpytest fixture. - ml/pipeline/core/manifest.py line 61:
_SAMPLE_TYPE_CLASS[data["sample_type"]]raises a bareKeyErroron unrecognised types. AValueError(f"Unknown sample_type: {data['sample_type']!r}")would be more informative.
Generated by Claude Code
jodavis
left a comment
There was a problem hiding this comment.
This review is focused only on the dev-team changes. Make these changes, and overall the design should be that subagents work together, with the script as the orchestrator, but they don't interact directly with GitHub or Jira. The top level agent passively watches the script's progress and makes all the updates to PRs, work items, and issues. If there's subs additional information that the top level agent needs to complete an update, we should find a way to plumb it through. But in general, the status updates front the script and the content in the context file should be enough.
There is one other thing: since the developer and reviewer can no longer communicate through PR threads, they will have to pass a structured list of comment threads back and forth, including all the comments and their responses, who they came from, and whether or not there are resolved. That structure with have to go into the context file as well, so the top level session can see them. Might need to add an ID as well, so the top level agent can map them to GitHub PR comment threads and update the right ones.
Also, there is a bug again that the review comments were added as a normal comment instead of a review. All the rules about PR comments that we added in the reviewer need to be transferred to the top level agent now.
jodavis-claude
left a comment
There was a problem hiding this comment.
No prior review threads.
Two new Priority 1 (correctness/fault tolerance) issues found in ml/pipeline/core/manifest.py:
-
manifest.py:63 — data["sample_type"] is an unguarded dict subscript. The preceding version check uses data.get("version") defensively, but sample_type access raises KeyError instead of ValueError for a malformed file. Use data.get("sample_type") and raise ValueError explicitly to match the established pattern.
-
manifest.py:70-81 — write() does not validate that all samples share the same type. Python generics are erased at runtime, so Manifest([TextSample(...), AudioSample(...)]) is constructable. If passed to write(), the method silently produces a JSON file whose sample_type header matches only the first entry; on read(), _deserialise tries to construct the wrong class for the remaining entries and fails with a confusing KeyError far from the original mistake — and the corrupt file is already on disk. Add a guard at the top of write(): if any(type(s) is not type(samples[0]) for s in samples): raise ValueError(f"Manifest contains mixed sample types: {set(type(s).name for s in samples)}").
All exit criteria are otherwise satisfied: all six dataclasses are present with fields matching spec lines 272-309; Manifest[S] with O(1) by_content_hash/by_id; ManifestStore read/write with discriminated dispatch; JSON schema version 1; all four round-trip tests plus dispatch, seed=0, parent_id, and numeric-type tests. The recent fix commits (version validation, ValueError for unknown sample_type, empty-manifest guard) are all correct additions.
Generated by Claude Code
There was a problem hiding this comment.
Pull request overview
Implements the ADR-221 Python ML data model: a Sample dataclass hierarchy (TextSample, AudioSample, SampleSpectrogram, SampleTokens) and a Manifest[S] / ManifestStore with JSON v1 round-trip serialisation, plus pytest configuration to run the new unit tests. The diff also pulls in unrelated changes: a small C# PersistSettings test fix for Linux, a switch of validate-build.sh to a Traversal-based project, and a substantial restructure of the .claude dev-team pipeline (scrum-master pattern, new prepare-pr-details skill, removal of developer-create-pr, structured review/sign-off JSON).
Changes:
- Add
ml/pipeline/core/{sample,manifest}.pywith six dataclasses and aManifestStorethat dispatches onsample_type, plus 26 round-trip tests andml/pyproject.toml. - Migrate
scripts/validate-build.shto aMicrosoft.Build.Traversalproject that filters out WPF/Console hosts on Linux, with a scopedDirectory.Build.propsthat removes inherited package references. - Refactor
.claude/dev-team workflow: top-level session handles all GitHub/Jira calls via[DEV-TEAM]milestone markers;developer-create-pris replaced byprepare-pr-details, and reviewer/sign-off skills now return structured JSON.
Reviewed changes
Copilot reviewed 19 out of 24 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| ml/pipeline/core/sample.py | Six dataclasses matching the ADR-221 spec. |
| ml/pipeline/core/manifest.py | Manifest[S] with O(1) indexes and ManifestStore JSON read/write. |
| ml/pipeline/{,core}/init.py, ml/test/.../init.py | New package markers. |
| ml/pyproject.toml | Pytest config (testpaths, pythonpath, not e2e default, marker). |
| ml/test/pipeline/core/test_manifest.py | 26 unit tests for round-trip, dispatch, lookup, empty manifest. |
| test/.../PersistSettingsTests.cs | Skip negative path expectations when %LocalAppData% doesn't expand (Linux). |
| scripts/validate-build.sh | Build the new traversal project instead of the solution. |
| scripts/validate-build.proj | Traversal project including src/test, excluding WPF/Console hosts on Linux. |
| scripts/Directory.Build.props | Strip root package references for traversal projects. |
| .claude/scripts/dev_team.py | Add pr_details/signoff_notes/base_branch context, [DEV-TEAM] milestones, new ReviewStep/SignoffStep flow. |
| .claude/commands/prepare-pr-details.md | New skill — returns PR JSON instead of calling gh. |
| .claude/commands/developer-create-pr.md | Deleted (superseded). |
| .claude/commands/reviewer-review.md | Uses git diff origin/$BASE_BRANCH..HEAD; returns body+comments+status. |
| .claude/commands/reviewer-sign-off.md | Consumes $REVIEW_NOTES; returns sign_off_body+status. |
| .claude/commands/dev-team.md | Scrum-master responsibilities and milestone-to-action table. |
| .claude/commands/debugger-investigate.md | Reads issue from context file; writes debug-comment section. |
| .claude/commands/researcher-issue.md | Adds MCP fallback when gh is unavailable. |
| .claude/agents/{developer,reviewer}.md | Skill renames; reviewer no longer posts to GitHub. |
| .claude/settings.json | Grant write permissions for .claude/logs/dev-team/* and command/agent/script dirs. |
84ed04b to
d79e446
Compare
…core/ - sample.py: Sample(ABC), SampleWithPath(Sample,ABC), TextSample, AudioSample, SampleSpectrogram, SampleTokens dataclasses with all spec fields. - manifest.py: Manifest[S] generic with O(1) by_id/by_content_hash lookups; ManifestStore.read() dispatches on sample_type; write() emits version-1 JSON. TextSample JSON omits path/parent_content_hash/applied_values; Spectrogram and Tokens JSON omits applied_values and includes parent_id per spec. - pyproject.toml: minimal pytest config so plain `pytest` from ml/ works. - Ambiguity resolutions: mutable dataclasses (no frozen=True); seed not a default field (callers pass seed=0 explicitly, matching spec's "PhraseVariator sets seed=0"). https://claude.ai/code/session_01DLVaMxWuhm75L3YaMdVTFT
1. Manifest.__init__: raise ValueError on duplicate sample ids (UUIDs must
be unique); keep first occurrence for duplicate content_hash (documented
contract — deterministic stages can produce identical hashes).
2. ManifestStore.read(): use data.get("sample_type") and raise ValueError
explicitly when the key is absent, matching the defensive pattern used
for "version".
3. ManifestStore.write(): validate that all samples share the same type
before writing; a mixed-type Manifest would produce a JSON file whose
sample_type header mismatches later entries, causing read() to KeyError.
Adds four new tests covering each guard.
https://claude.ai/code/session_01DLVaMxWuhm75L3YaMdVTFT
Tests use pytest's tmp_path fixture; tempfile was never called. https://claude.ai/code/session_01DLVaMxWuhm75L3YaMdVTFT
d79e446 to
b73a4c7
Compare
|
@copilot Address all PR review comments |
Addressed in existing commits |
TextSample now derives its id from content_hash via __post_init__, so callers no longer pass id= explicitly. This means two TextSamples with the same content always share the same id, enforcing the deduplication invariant at the type level. Update ManifestStore._deserialise to omit id= when constructing TextSample, and update tests to match: duplicate-id and duplicate-content-hash tests now use AudioSample (where id and content_hash are independent) to exercise those code paths. https://claude.ai/code/session_01DLVaMxWuhm75L3YaMdVTFT
Work item
ADR-221: Implement the
Sampleclass hierarchy andManifest[S]/ManifestStoreinml/pipeline/core/— the shared data model that every subsequent pipeline stage depends on, including JSON round-trip serialisation and the content-hash formula.Changes
ml/pipeline/__init__.py— new empty package markerml/pipeline/core/__init__.py— new empty package markerml/pipeline/core/sample.py— all six dataclasses (Sample,SampleWithPath,TextSample,AudioSample,SampleSpectrogram,SampleTokens) with every field from the spec;TextSample.idassignedstr(uuid4())ml/pipeline/core/manifest.py—Manifest[S]generic with O(1)by_id/by_content_hashindex dicts;ManifestStore.read()(type-registry dispatch onsample_type) andwrite()(version-1 JSON)ml/pyproject.toml— minimal pytest config (testpaths = ["test"],pythonpath = ["."]) enabling plainpytestinvocation fromml/ml/test/__init__.py— new empty package markerml/test/pipeline/__init__.py— new empty package markerml/test/pipeline/core/__init__.py— new empty package markerml/test/pipeline/core/test_manifest.py— 26 unit tests covering all four round-trip scenarios,sample_typedispatch,TextSample.seed == 0serialisation,SampleSpectrogram.parent_idpreservation, andAudioSample.applied_valuesnumeric-type fidelityDesign decisions
seedhas no dataclass default — the spec states "PhraseVariator explicitly setsseed=0at construction", meaning callers pass it explicitly; a default would silently mask missing arguments in later stages.frozen=False) — the spec is silent onfrozen=; downstreamModifierStageusesSampleinstances as dict values (not keys), so mutability is safe and avoids unnecessary copy overhead.pyproject.tomlcreated inml/— resolves the known ambiguity thatvalidate-buildandvalidate-testsonly cover .NET; enablespytest ml/test/pipeline/core/without manualPYTHONPATHmanipulation.Generated by Claude Code