Add failing tests for #661: cross-cycle failure memory #672
Draft
prompt-driven-github[bot] wants to merge 1 commit into main from
Unit tests (6 bug-detecting + 1 regression) and E2E tests (4 bug-detecting + 1 regression) that verify the orchestrator should preserve step failure history across cycles. All bug-detecting tests correctly fail on current code, confirming the bug at lines 696, 920, 924, and 737-764 of agentic_e2e_fix_orchestrator.py. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Adds failing tests that detect the bug reported in #661: the pdd fix orchestrator has no memory of step outcomes across cycles, causing deterministic failures to be retried identically.
Test Files
- tests/test_agentic_e2e_fix_orchestrator.py: 6 bug-detecting tests + 1 regression test in the TestCrossCycleFailureMemory class
- tests/test_e2e_issue_661_cross_cycle_failure_memory.py: 4 bug-detecting tests + 1 regression test
What This PR Contains
- step_outputs cleared at cycle boundary (line 920)
- state_data["step_outputs"] cleared in persisted state (line 924)
Root Cause
The orchestrator clears step_outputs (step_outputs = {}) at 3 locations (lines 696, 920, 924 of agentic_e2e_fix_orchestrator.py), erasing all knowledge of what failed and why at every cycle boundary. Additionally, the prompt context dict (lines 737-764) never includes prior-cycle failure data, so even if failures were preserved, the LLM wouldn't see them.
Next Steps
step_outputs, inject into prompt context, add skip logic for deterministically failed steps
Fixes #661
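For reviewers, here is a minimal sketch of how the follow-up fix described under Next Steps might look. It is not the orchestrator's actual code: FailureMemory, record, is_deterministic, and prompt_context are hypothetical names, and the "same error text twice in a row" rule is only one assumed way to decide that a failure is deterministic. The idea is to keep failure history in a structure that is not reset when step_outputs is cleared at a cycle boundary.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FailureMemory:
    """Hypothetical store for step failures that outlives a single cycle."""

    # Maps step name -> list of (cycle number, error text), oldest first.
    history: dict[str, list[tuple[int, str]]] = field(default_factory=dict)

    def record(self, step: str, cycle: int, error: str) -> None:
        """Remember that `step` failed with `error` during `cycle`."""
        self.history.setdefault(step, []).append((cycle, error))

    def is_deterministic(self, step: str, threshold: int = 2) -> bool:
        """Assumed rule: a step is deterministically failing once its last
        `threshold` recorded failures carry identical error text."""
        recent = [err for _, err in self.history.get(step, [])[-threshold:]]
        return len(recent) == threshold and len(set(recent)) == 1

    def prompt_context(self) -> str:
        """Render the history as plain text that could be added to the prompt."""
        return "\n".join(
            f"cycle {cycle}, {step}: {error}"
            for step, entries in self.history.items()
            for cycle, error in entries
        )


# Usage: the per-cycle dict is reset, but the memory object is not.
memory = FailureMemory()

step_outputs = {"run_tests": "FAILED"}          # cycle 1 (illustrative values)
memory.record("run_tests", cycle=1, error="missing fixture")

step_outputs = {}                               # cycle boundary reset
step_outputs["run_tests"] = "FAILED"            # cycle 2 fails the same way
memory.record("run_tests", cycle=2, error="missing fixture")

# A caller could use this flag to skip the step instead of retrying it.
should_skip = memory.is_deterministic("run_tests")
context_text = memory.prompt_context()
```

Keeping the history separate from step_outputs means the existing resets can stay as they are. Whether the real fix takes this route or preserves entries inside step_outputs itself is a design decision for the follow-up change.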
Generated by PDD agentic bug workflow