
Add failing tests for #661: cross-cycle failure memory #672

Draft
prompt-driven-github[bot] wants to merge 1 commit into main from fix/issue-661

@prompt-driven-github
Contributor

Summary

Adds failing tests that detect the bug reported in #661: the pdd fix orchestrator has no memory of step outcomes across cycles, causing deterministic failures to be retried identically.

Test Files

  • Unit tests: tests/test_agentic_e2e_fix_orchestrator.py — 6 bug-detecting tests + 1 regression test in TestCrossCycleFailureMemory class
  • E2E tests: tests/test_e2e_issue_661_cross_cycle_failure_memory.py — 4 bug-detecting tests + 1 regression test

What This PR Contains

  • Failing unit tests that reproduce the reported bug through 4 channels:
    1. step_outputs cleared at cycle boundary (line 920)
    2. state_data["step_outputs"] cleared in persisted state (line 924)
    3. No prior-cycle failure data in prompt context dict (lines 737-764)
    4. Failure history lost on resume from completed cycle (line 696)
  • Failing E2E tests that verify the bug through the full orchestrator pipeline with real prompt templates
  • All bug-detecting tests are verified to fail on current code and will pass once the bug is fixed
  • Regression tests pass, confirming the happy path is unaffected
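
As a rough illustration of channels 1 and 3, the sketch below uses a toy stand-in rather than the real orchestrator. `toy_orchestrator`, `always_fails`, and `failure_visible_in_later_cycles` are hypothetical names, and the context keys are assumptions; only the `step_outputs` reset mirrors what this PR describes. The property check returns False on the toy loop, which is the kind of result the bug-detecting tests are expected to produce on current code.

```python
from typing import Callable, Dict, List

StepFn = Callable[[int, Dict[str, str]], str]


def toy_orchestrator(step_fn: StepFn, max_cycles: int) -> List[Dict[str, str]]:
    """Hypothetical stand-in that mimics the reported behaviour.

    Returns the prompt context that each cycle was given.
    """
    seen_contexts: List[Dict[str, str]] = []
    step_outputs: Dict[str, str] = {}
    for cycle in range(1, max_cycles + 1):
        # Channel 3: the context carries no prior-cycle failure data.
        context = {"cycle": str(cycle)}
        seen_contexts.append(dict(context))
        step_outputs["step1"] = step_fn(cycle, context)
        # Channel 1: outputs are reset at the cycle boundary.
        step_outputs = {}
    return seen_contexts


def always_fails(cycle: int, context: Dict[str, str]) -> str:
    """A deterministic failure: same error text every cycle."""
    return "FAILED: missing dependency"


def failure_visible_in_later_cycles(contexts: List[Dict[str, str]]) -> bool:
    """Property a bug-detecting test would assert: cycles after the
    first can see the earlier failure text in their prompt context."""
    return all(
        "missing dependency" in " ".join(ctx.values()) for ctx in contexts[1:]
    )


contexts = toy_orchestrator(always_fails, max_cycles=3)
print(failure_visible_in_later_cycles(contexts))
```

A regression test in the same style would instead run a step that succeeds and assert that the loop still completes normally.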

Root Cause

The orchestrator resets step_outputs = {} at 3 locations (lines 696, 920, and 924 of agentic_e2e_fix_orchestrator.py), erasing all record of what failed and why at every cycle boundary. In addition, the prompt context dict (lines 737-764) never includes prior-cycle failure data, so even if failures were preserved, the LLM would not see them.
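
The persisted-state and resume channels can be pictured with a small toy model. This is a sketch under assumptions, not the orchestrator's code: `finish_cycle`, `resume`, and the `current_cycle` key are hypothetical, and JSON is used only as a convenient stand-in for whatever persistence format the orchestrator actually uses. The `step_outputs` key is the one named in this PR.

```python
import json
from typing import Any, Dict


def finish_cycle(state: Dict[str, Any]) -> str:
    """Toy cycle boundary: advance the cycle counter, reset the
    outputs as described above, and persist the result."""
    new_state = dict(state)
    new_state["current_cycle"] = state["current_cycle"] + 1
    new_state["step_outputs"] = {}  # failure details are dropped here
    return json.dumps(new_state)


def resume(persisted: str) -> Dict[str, Any]:
    """Toy resume: load whatever was persisted at the last boundary."""
    return json.loads(persisted)


state = {
    "current_cycle": 1,
    "step_outputs": {"step2": "FAILED: import error"},
}
persisted = finish_cycle(state)
resumed = resume(persisted)

# The resumed run knows which cycle it is in, but not what failed before.
print(resumed)
```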

Next Steps

  1. Implement the fix: accumulate failure history before clearing step_outputs, inject it into the prompt context, and add skip logic for deterministically failed steps
  2. Verify unit tests pass
  3. Verify E2E tests pass
  4. Run full test suite
  5. Mark PR as ready for review
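
One possible shape for step 1 is sketched below. Everything here is an assumption made for illustration: the helper names (`record_failures`, `build_context`, `should_skip`, `run`), the `prior_failures` context key, the "FAILED" prefix convention, and the rule that two identical consecutive failures count as deterministic. The actual fix may differ on all of these.

```python
from typing import Dict, List, Tuple

Failure = Dict[str, object]


def record_failures(history: List[Failure], cycle: int,
                    step_outputs: Dict[str, str]) -> None:
    """Accumulate failures before step_outputs is cleared."""
    for step, output in step_outputs.items():
        if output.startswith("FAILED"):  # toy failure convention
            history.append({"cycle": cycle, "step": step, "output": output})


def build_context(cycle: int, history: List[Failure]) -> Dict[str, str]:
    """Inject the accumulated history into the prompt context."""
    lines = [f"cycle {h['cycle']} {h['step']}: {h['output']}" for h in history]
    return {"cycle": str(cycle), "prior_failures": "\n".join(lines)}


def should_skip(step: str, history: List[Failure], threshold: int = 2) -> bool:
    """Skip a step whose last `threshold` failures were identical."""
    outputs = [h["output"] for h in history if h["step"] == step]
    return len(outputs) >= threshold and len(set(outputs[-threshold:])) == 1


def run(max_cycles: int) -> Tuple[List[Dict[str, str]], List[Failure], List[int]]:
    history: List[Failure] = []
    contexts: List[Dict[str, str]] = []
    skipped_cycles: List[int] = []
    for cycle in range(1, max_cycles + 1):
        contexts.append(build_context(cycle, history))
        step_outputs: Dict[str, str] = {}
        if should_skip("step1", history):
            skipped_cycles.append(cycle)
        else:
            step_outputs["step1"] = "FAILED: missing dependency"
        record_failures(history, cycle, step_outputs)
        step_outputs = {}  # clearing is now safe: history keeps the record
    return contexts, history, skipped_cycles


contexts, history, skipped_cycles = run(max_cycles=3)
print(skipped_cycles)
```

In this toy run the step fails identically in cycles 1 and 2, so cycle 3 skips it, and the cycle 2 context already carries the cycle 1 failure text.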

Fixes #661


Generated by PDD agentic bug workflow

Unit tests (6 bug-detecting + 1 regression) and E2E tests (4 bug-detecting
+ 1 regression) that check that the orchestrator preserves step failure
history across cycles. All bug-detecting tests fail on current code, as
intended, confirming the bug at lines 696, 920, 924, and 737-764 of
agentic_e2e_fix_orchestrator.py.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

pdd fix: workflow retries deterministically failed steps across cycles, wasting cost and time