Add failing tests for #661: cross-cycle failure memory by prompt-driven-github[bot] · Pull Request #672 · promptdriven/pdd

prompt-driven-github · 2026-03-11T03:42:58Z

Summary

Adds failing tests that detect the bug reported in #661: the pdd fix orchestrator has no memory of step outcomes across cycles, causing deterministic failures to be retried identically.

Test Files

Unit tests: tests/test_agentic_e2e_fix_orchestrator.py — 6 bug-detecting tests + 1 regression test in TestCrossCycleFailureMemory class
E2E tests: tests/test_e2e_issue_661_cross_cycle_failure_memory.py — 4 bug-detecting tests + 1 regression test

What This PR Contains

Failing unit tests that reproduce the reported bug through 4 channels:
1. step_outputs cleared at cycle boundary (line 920)
2. state_data["step_outputs"] cleared in persisted state (line 924)
3. No prior-cycle failure data in prompt context dict (lines 737-764)
4. Failure history lost on resume from completed cycle (line 696)
Failing E2E tests that verify the bug through the full orchestrator pipeline with real prompt templates
All bug-detecting tests are verified to fail on current code and will pass once the bug is fixed
Regression tests pass, confirming the happy path is unaffected

Root Cause

The orchestrator resets step_outputs to {} at three locations (lines 696, 920, and 924 of agentic_e2e_fix_orchestrator.py), erasing all record of what failed and why at every cycle boundary. In addition, the prompt context dict (lines 737-764) never includes prior-cycle failure data, so even if failures were preserved, the LLM would not see them.
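To make the failure mode concrete, here is a toy model of a cycle loop that discards its outcomes at each boundary. It is a sketch only: `run_cycles_without_memory`, `always_fails`, and the step names are hypothetical and are not taken from agentic_e2e_fix_orchestrator.py.

```python
# Toy model of the reported behaviour. NOT the real orchestrator code;
# every name here is a hypothetical stand-in.

def run_cycles_without_memory(steps, max_cycles):
    """Run every step in every cycle, resetting step_outputs at each boundary."""
    attempts = []        # (cycle, step name, output), kept only for inspection
    step_outputs = {}
    for cycle in range(1, max_cycles + 1):
        for name, step in steps.items():
            step_outputs[name] = step()
            attempts.append((cycle, name, step_outputs[name]))
        # Cycle boundary: all knowledge of what failed and why is discarded,
        # so the next cycle starts from the same blank state.
        step_outputs = {}
    return attempts, step_outputs


def always_fails():
    """A deterministic failure: same input, same error, every time."""
    return "FAILED: missing dependency"


attempts, leftover = run_cycles_without_memory({"step1": always_fails}, max_cycles=3)
```

In this model the deterministic failure is attempted once per cycle with an identical result, and nothing about it survives to the end of the run.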

Next Steps

Implement the fix: accumulate failure history before clearing step_outputs, inject into prompt context, add skip logic for deterministically failed steps
Verify unit tests pass
Verify E2E tests pass
Run full test suite
Mark PR as ready for review
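The first step above (accumulate, inject, skip) could look roughly like the sketch below. It is an illustration under stated assumptions, not the planned implementation: `failure_history`, `is_failure`, and `run_cycles_with_memory` are hypothetical names, and the sketch treats every failure as deterministic, whereas a real fix would need its own criterion for that.

```python
# Minimal sketch of cross-cycle failure memory. All names are hypothetical.
from typing import Callable, Dict, List, Tuple


def is_failure(output: str) -> bool:
    # Assumed convention for this sketch only: failures start with "FAILED".
    return output.startswith("FAILED")


def run_cycles_with_memory(
    steps: Dict[str, Callable[[dict], str]], max_cycles: int
) -> Tuple[List[Tuple[int, str]], List[dict]]:
    failure_history: List[dict] = []   # survives cycle boundaries
    calls: List[Tuple[int, str]] = []  # which steps actually ran, per cycle
    for cycle in range(1, max_cycles + 1):
        step_outputs: Dict[str, str] = {}
        for name, step in steps.items():
            # Skip logic: do not retry a step that already failed.
            # Assumption: every recorded failure is deterministic.
            if any(entry["step"] == name for entry in failure_history):
                continue
            # Inject prior-cycle failures into the context the step sees.
            context = {"cycle": cycle, "failure_history": list(failure_history)}
            step_outputs[name] = step(context)
            calls.append((cycle, name))
        # Accumulate failures BEFORE step_outputs is cleared.
        for name, output in step_outputs.items():
            if is_failure(output):
                failure_history.append(
                    {"cycle": cycle, "step": name, "output": output}
                )
        step_outputs = {}  # clearing is now safe: the history is preserved
    return calls, failure_history


steps = {
    "step_a": lambda ctx: "FAILED: missing dependency",
    "step_b": lambda ctx: "ok",
}
calls, history = run_cycles_with_memory(steps, max_cycles=3)
```

Here the failing step runs only in the first cycle, while the passing step keeps running and receives the recorded failure in its context from the second cycle onward.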

Fixes #661

Generated by PDD agentic bug workflow

Unit tests (6 bug-detecting + 1 regression) and E2E tests (4 bug-detecting + 1 regression) that verify the orchestrator should preserve step failure history across cycles. All bug-detecting tests correctly fail on current code, confirming the bug at lines 696, 920, 924, and 737-764 of agentic_e2e_fix_orchestrator.py.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

prompt-driven-github bot mentioned this pull request Mar 11, 2026

pdd fix: workflow retries deterministically failed steps across cycles, wasting cost and time #661

Closed

prompt-driven-github[bot] wants to merge 1 commit into main from fix/issue-661