ADR-221: Implement Sample class hierarchy and ManifestStore with JSON round-trip serialisation by jodavis-claude · Pull Request #203 · jodavis/AdaptiveRemote

jodavis-claude · 2026-05-30T01:00:06Z

Work item

ADR-221: Implement the Sample class hierarchy and Manifest[S] / ManifestStore in ml/pipeline/core/ — the shared data model that every subsequent pipeline stage depends on, including JSON round-trip serialisation and the content-hash formula.

Changes

ml/pipeline/__init__.py — new empty package marker
ml/pipeline/core/__init__.py — new empty package marker
ml/pipeline/core/sample.py — all six dataclasses (Sample, SampleWithPath, TextSample, AudioSample, SampleSpectrogram, SampleTokens) with every field from the spec; TextSample.id assigned str(uuid4())
ml/pipeline/core/manifest.py — Manifest[S] generic with O(1) by_id/by_content_hash index dicts; ManifestStore.read() (type-registry dispatch on sample_type) and write() (version-1 JSON)
ml/pyproject.toml — minimal pytest config (testpaths = ["test"], pythonpath = ["."]) enabling plain pytest invocation from ml/
ml/test/__init__.py — new empty package marker
ml/test/pipeline/__init__.py — new empty package marker
ml/test/pipeline/core/__init__.py — new empty package marker
ml/test/pipeline/core/test_manifest.py — 26 unit tests covering all four round-trip scenarios, sample_type dispatch, TextSample.seed == 0 serialisation, SampleSpectrogram.parent_id preservation, and AudioSample.applied_values numeric-type fidelity

Design decisions

seed has no dataclass default — the spec states "PhraseVariator explicitly sets seed=0 at construction", meaning callers pass it explicitly; a default would silently mask missing arguments in later stages.
Mutable dataclasses (frozen=False) — the spec is silent on frozen=; downstream ModifierStage uses Sample instances as dict values (not keys), so mutability is safe and avoids unnecessary copy overhead.
pyproject.toml created in ml/ — resolves the known ambiguity that validate-build and validate-tests only cover .NET; enables pytest ml/test/pipeline/core/ without manual PYTHONPATH manipulation.

Generated by Claude Code

github-actions · 2026-05-30T01:05:12Z

Test Results

401 tests ±0 401 ✅ ±0 2m 5s ⏱️ -35s
5 suites ±0 0 💤 ±0
5 files ±0 0 ❌ ±0

Results for commit 091b5ca. ± Comparison against base commit 2b4bc34.

♻️ This comment has been updated with latest results.

jodavis-claude

Priority 1 — Correctness

ml/pipeline/core/manifest.py line 67: ManifestStore.write() raises IndexError on an empty manifest. sample_type = _SAMPLE_TYPE_KEY[type(samples[0])] throws when manifest.samples is empty. Manifest([]) is a valid object (see test_empty_manifest), so write() must either support it or raise a descriptive ValueError("Cannot write empty manifest") instead of an opaque index error.
ml/pipeline/core/manifest.py line 61: ManifestStore.read() ignores the "version" field entirely. The JSON schema explicitly carries "version": 1. Without a version check, a file written by a future incompatible schema is silently deserialized as version 1 and produces corrupt objects with no diagnostic. Add if data.get("version") != 1: raise ValueError(f"Unsupported manifest version: {data.get('version')!r}").

Priority 4 — Test coverage

ml/test/pipeline/core/test_manifest.py: No test exercises ManifestStore().write(Manifest([]), path). The existing test_empty_manifest creates Manifest([]) and calls lookup methods but never calls write(), so the IndexError from issue 1 is not caught by the test suite. Add a test under TestManifestLookup (or a dedicated class) that confirms the behaviour of writing an empty manifest.

Style (non-blocking)

ml/test/pipeline/core/test_manifest.py line 4: import tempfile is unused — tests use the tmp_path pytest fixture.
ml/pipeline/core/manifest.py line 61: _SAMPLE_TYPE_CLASS[data["sample_type"]] raises a bare KeyError on unrecognised types. A ValueError(f"Unknown sample_type: {data['sample_type']!r}") would be more informative.

Generated by Claude Code

jodavis

This review is focused only on the dev-team changes. Make these changes, and overall the design should be that subagents work together, with the script as the orchestrator, but they don't interact directly with GitHub or Jira. The top level agent passively watches the script's progress and makes all the updates to PRs, work items, and issues. If there's subs additional information that the top level agent needs to complete an update, we should find a way to plumb it through. But in general, the status updates front the script and the content in the context file should be enough.

There is one other thing: since the developer and reviewer can no longer communicate through PR threads, they will have to pass a structured list of comment threads back and forth, including all the comments and their responses, who they came from, and whether or not there are resolved. That structure with have to go into the context file as well, so the top level session can see them. Might need to add an ID as well, so the top level agent can map them to GitHub PR comment threads and update the right ones.

Also, there is a bug again that the review comments were added as a normal comment instead of a review. All the rules about PR comments that we added in the reviewer need to be transferred to the top level agent now.

jodavis-claude

No prior review threads.

Two new Priority 1 (correctness/fault tolerance) issues found in ml/pipeline/core/manifest.py:

manifest.py:63 — data["sample_type"] is an unguarded dict subscript. The preceding version check uses data.get("version") defensively, but sample_type access raises KeyError instead of ValueError for a malformed file. Use data.get("sample_type") and raise ValueError explicitly to match the established pattern.
manifest.py:70-81 — write() does not validate that all samples share the same type. Python generics are erased at runtime, so Manifest([TextSample(...), AudioSample(...)]) is constructable. If passed to write(), the method silently produces a JSON file whose sample_type header matches only the first entry; on read(), _deserialise tries to construct the wrong class for the remaining entries and fails with a confusing KeyError far from the original mistake — and the corrupt file is already on disk. Add a guard at the top of write(): if any(type(s) is not type(samples[0]) for s in samples): raise ValueError(f"Manifest contains mixed sample types: {set(type(s).name for s in samples)}").

All exit criteria are otherwise satisfied: all six dataclasses are present with fields matching spec lines 272-309; Manifest[S] with O(1) by_content_hash/by_id; ManifestStore read/write with discriminated dispatch; JSON schema version 1; all four round-trip tests plus dispatch, seed=0, parent_id, and numeric-type tests. The recent fix commits (version validation, ValueError for unknown sample_type, empty-manifest guard) are all correct additions.

Generated by Claude Code

Copilot

Pull request overview

Implements the ADR-221 Python ML data model: a Sample dataclass hierarchy (TextSample, AudioSample, SampleSpectrogram, SampleTokens) and a Manifest[S] / ManifestStore with JSON v1 round-trip serialisation, plus pytest configuration to run the new unit tests. The diff also pulls in unrelated changes: a small C# PersistSettings test fix for Linux, a switch of validate-build.sh to a Traversal-based project, and a substantial restructure of the .claude dev-team pipeline (scrum-master pattern, new prepare-pr-details skill, removal of developer-create-pr, structured review/sign-off JSON).

Changes:

Add ml/pipeline/core/{sample,manifest}.py with six dataclasses and a ManifestStore that dispatches on sample_type, plus 26 round-trip tests and ml/pyproject.toml.
Migrate scripts/validate-build.sh to a Microsoft.Build.Traversal project that filters out WPF/Console hosts on Linux, with a scoped Directory.Build.props that removes inherited package references.
Refactor .claude/ dev-team workflow: top-level session handles all GitHub/Jira calls via [DEV-TEAM] milestone markers; developer-create-pr is replaced by prepare-pr-details, and reviewer/sign-off skills now return structured JSON.

Reviewed changes

Copilot reviewed 19 out of 24 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
ml/pipeline/core/sample.py	Six dataclasses matching the ADR-221 spec.
ml/pipeline/core/manifest.py	`Manifest[S]` with O(1) indexes and `ManifestStore` JSON read/write.
ml/pipeline/{,core}/init.py, ml/test/.../init.py	New package markers.
ml/pyproject.toml	Pytest config (testpaths, pythonpath, `not e2e` default, marker).
ml/test/pipeline/core/test_manifest.py	26 unit tests for round-trip, dispatch, lookup, empty manifest.
test/.../PersistSettingsTests.cs	Skip negative path expectations when `%LocalAppData%` doesn't expand (Linux).
scripts/validate-build.sh	Build the new traversal project instead of the solution.
scripts/validate-build.proj	Traversal project including src/test, excluding WPF/Console hosts on Linux.
scripts/Directory.Build.props	Strip root package references for traversal projects.
.claude/scripts/dev_team.py	Add `pr_details`/`signoff_notes`/`base_branch` context, `[DEV-TEAM]` milestones, new ReviewStep/SignoffStep flow.
.claude/commands/prepare-pr-details.md	New skill — returns PR JSON instead of calling `gh`.
.claude/commands/developer-create-pr.md	Deleted (superseded).
.claude/commands/reviewer-review.md	Uses `git diff origin/$BASE_BRANCH..HEAD`; returns `body`+`comments`+`status`.
.claude/commands/reviewer-sign-off.md	Consumes `$REVIEW_NOTES`; returns `sign_off_body`+`status`.
.claude/commands/dev-team.md	Scrum-master responsibilities and milestone-to-action table.
.claude/commands/debugger-investigate.md	Reads issue from context file; writes debug-comment section.
.claude/commands/researcher-issue.md	Adds MCP fallback when `gh` is unavailable.
.claude/agents/{developer,reviewer}.md	Skill renames; reviewer no longer posts to GitHub.
.claude/settings.json	Grant write permissions for `.claude/logs/dev-team/*` and command/agent/script dirs.

…core/ - sample.py: Sample(ABC), SampleWithPath(Sample,ABC), TextSample, AudioSample, SampleSpectrogram, SampleTokens dataclasses with all spec fields. - manifest.py: Manifest[S] generic with O(1) by_id/by_content_hash lookups; ManifestStore.read() dispatches on sample_type; write() emits version-1 JSON. TextSample JSON omits path/parent_content_hash/applied_values; Spectrogram and Tokens JSON omits applied_values and includes parent_id per spec. - pyproject.toml: minimal pytest config so plain `pytest` from ml/ works. - Ambiguity resolutions: mutable dataclasses (no frozen=True); seed not a default field (callers pass seed=0 explicitly, matching spec's "PhraseVariator sets seed=0"). https://claude.ai/code/session_01DLVaMxWuhm75L3YaMdVTFT

1. Manifest.__init__: raise ValueError on duplicate sample ids (UUIDs must be unique); keep first occurrence for duplicate content_hash (documented contract — deterministic stages can produce identical hashes). 2. ManifestStore.read(): use data.get("sample_type") and raise ValueError explicitly when the key is absent, matching the defensive pattern used for "version". 3. ManifestStore.write(): validate that all samples share the same type before writing; a mixed-type Manifest would produce a JSON file whose sample_type header mismatches later entries, causing read() to KeyError. Adds four new tests covering each guard. https://claude.ai/code/session_01DLVaMxWuhm75L3YaMdVTFT

Tests use pytest's tmp_path fixture; tempfile was never called. https://claude.ai/code/session_01DLVaMxWuhm75L3YaMdVTFT

jodavis · 2026-05-30T20:56:30Z

@copilot Address all PR review comments

Copilot · 2026-05-30T21:00:19Z

@copilot Address all PR review comments

Addressed in existing commits b8b7c67 and b73a4c7. These include the manifest read/write validation hardening (version/sample_type/empty/mixed-type guards), improved error behavior, added coverage for empty write + validation edge cases, and test cleanup.

TextSample now derives its id from content_hash via __post_init__, so callers no longer pass id= explicitly. This means two TextSamples with the same content always share the same id, enforcing the deduplication invariant at the type level. Update ManifestStore._deserialise to omit id= when constructing TextSample, and update tests to match: duplicate-id and duplicate-content-hash tests now use AudioSample (where id and content_hash are independent) to exercise those code paths. https://claude.ai/code/session_01DLVaMxWuhm75L3YaMdVTFT

jodavis-claude commented May 30, 2026

View reviewed changes

jodavis requested changes May 30, 2026

View reviewed changes

jodavis-claude commented May 30, 2026

View reviewed changes

jodavis requested a review from Copilot May 30, 2026 18:39

Copilot started reviewing on behalf of jodavis May 30, 2026 18:39 View session

Copilot AI reviewed May 30, 2026

View reviewed changes

Comment thread ml/pipeline/core/manifest.py

Comment thread ml/pipeline/core/manifest.py

jodavis marked this pull request as ready for review May 30, 2026 19:34

jodavis force-pushed the dev/claude/ADR-221-implement-the-sample-class-hierarchy-and branch from 84ed04b to d79e446 Compare May 30, 2026 20:21

claude added 3 commits May 30, 2026 13:23

ADR-221: remove unused import tempfile from test_manifest.py

b73a4c7

Tests use pytest's tmp_path fixture; tempfile was never called. https://claude.ai/code/session_01DLVaMxWuhm75L3YaMdVTFT

jodavis force-pushed the dev/claude/ADR-221-implement-the-sample-class-hierarchy-and branch from d79e446 to b73a4c7 Compare May 30, 2026 20:23

jodavis requested changes May 30, 2026

View reviewed changes

Comment thread ml/pipeline/core/sample.py

Copilot started work on behalf of jodavis May 30, 2026 20:56 View session

jodavis enabled auto-merge May 30, 2026 20:56

Copilot finished work on behalf of jodavis May 30, 2026 21:00

Copilot AI requested a review from jodavis May 30, 2026 21:00

jodavis approved these changes May 30, 2026

View reviewed changes

jodavis merged commit b835493 into feature/ADR-191-oop-ml-pipeline May 30, 2026
2 checks passed

jodavis deleted the dev/claude/ADR-221-implement-the-sample-class-hierarchy-and branch May 30, 2026 21:09

Conversation

jodavis-claude commented May 30, 2026

Work item

Changes

Design decisions

Uh oh!

github-actions Bot commented May 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Test Results

Uh oh!

jodavis-claude left a comment

Choose a reason for hiding this comment

Uh oh!

jodavis left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jodavis-claude left a comment

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jodavis commented May 30, 2026

Uh oh!

Copilot AI commented May 30, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

github-actions Bot commented May 30, 2026 •

edited

Loading