
fix(backend): tighten action items extraction to reduce garbage tasks #6158

Merged
mdmohsin7 merged 1 commit into main from worktree-fix-action-items-quality
Apr 8, 2026


Conversation

mdmohsin7 (Member) commented Mar 29, 2026

Summary

  • Revert rule 5 from permissive "Future Intent or Deadline" back to strict "NOT Already Being Done or About to Do Immediately" — stops extracting tasks for things being done right now
  • Add single-topic dedup limit (1 item per topic, not 1 per variation/detail)
  • Add real-time exchange exclusions (brief in-person conversations resolved on the spot → 0 items)
  • Strengthen implicit task filtering to default-to-nothing stance
  • Compress verbose due date section from ~44 to ~11 lines so quality rules carry more weight

Context

Users reported that the system generates multiple garbage action items from casual conversations. Example: a 90-second exchange about getting water/soda from the kitchen was generating 6 items. The extraction prompt had accumulated permissive rules over time (loosened filtering, plus verbose date handling added in Mar b4218f796 that diluted the quality rules). This likely interacted with a gpt-5.1 model update to tip quality over the edge in recent weeks. This PR tightens the prompt to restore a strict quality bar.

Test plan

  • All 25 prompt caching tests pass
  • All other backend tests pass (4 pre-existing failures in unrelated desktop update tests)
  • Monitor action item quality after deploy — expect significant reduction in noise

🤖 Generated with Claude Code

…e tasks

Revert rule 5 from permissive "Future Intent or Deadline" back to strict
"NOT Already Being Done or About to Do Immediately" — stops extracting
tasks for things the user is currently doing or about to do.

Add single-topic dedup limit, real-time exchange exclusions, and stronger
implicit-task default-to-nothing stance. Compress verbose date section
from ~44 to ~11 lines so quality filtering rules carry more weight.

greptile-apps bot (Contributor) commented Mar 29, 2026

Greptile Summary

This PR tightens the LLM prompt used for action-item extraction from conversations, reverting a December 2025 loosening of rule 5 that caused common "I'm going to X" phrasing to generate spurious tasks (e.g., fetching water producing 6 items). The change is purely a prompt edit inside extract_action_items — no Python logic is modified.

Key changes:

  • Rule 5 reverted from "Future Intent or Deadline" → "NOT Already Being Done or About to Do Immediately", so "I'm going to X" / "I'll do X" / "Let me X" all default to SKIP.
  • New SINGLE-TOPIC LIMIT added to deduplication rules: at most 1 item per conversation topic.
  • Three new EXCLUDE bullets covering real-time in-person exchanges resolved on the spot.
  • Implicit task workflow step 3 strengthened to "default to extracting NOTHING."
  • Due-date extraction section compressed ~75% (44 → 11 lines) while preserving all logic.

Issues found:

  • The new "Today I will X" rule (SKIP unless there's a specific time/deadline attached) is ambiguous because "today" itself constitutes a deadline; same-day hard commitments (tax filings, payments) may be incorrectly suppressed.
  • The SINGLE-TOPIC LIMIT in the deduplication block has no carve-out for explicit user requests, potentially conflicting with the ALWAYS-EXTRACT rule that governs explicit reminders — two distinct explicit requests on a related topic could be collapsed to one.

Confidence Score: 4/5

Safe to merge with the noted prompt inconsistencies being minor — the core regression fix is sound and well-motivated.

All changes are confined to a prompt string; no Python runtime logic is altered. The two flagged issues are P2 prompt-quality gaps (ambiguous 'Today I will X' skip rule and missing explicit-request carve-out on SINGLE-TOPIC LIMIT) that could cause occasional LLM mis-decisions but will not crash or corrupt data. Score is 4 rather than 5 because the inconsistencies are in the primary decision rules of the prompt and could partially undermine the quality goals of this fix.

backend/utils/llm/conversation_processing.py — pay attention to the SINGLE-TOPIC LIMIT dedup rule and the 'Today I will X' skip rule.

Important Files Changed

| Filename | Overview |
| --- | --- |
| backend/utils/llm/conversation_processing.py | Prompt-only change tightening action item extraction rules: reverts rule 5 to 'NOT already being done', adds SINGLE-TOPIC LIMIT, real-time exclusions, and compresses the due-date section; two minor prompt-logic inconsistencies noted. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Conversation text] --> B{Explicit request?\nRemind me / Add task / etc.}
    B -- Yes --> C[ALWAYS EXTRACT\nBypasses all other filters]
    B -- No --> D{Implicit task?}
    D --> E{Is user currently doing it\nor about to do it immediately?}
    E -- Yes --> F[SKIP]
    E -- No --> G{Being resolved in real-time\nbetween participants?}
    G -- Yes --> F
    G -- No --> H{Would a busy person\ngenuinely forget this?}
    H -- No --> F
    H -- Yes --> I{Passes ALL 5 strict\nfiltering rules?}
    I -- No --> F
    I -- Yes --> J{Duplicate or same topic\nas existing item?}
    J -- Yes --> F
    J -- No --> K[EXTRACT action item]
    K --> L[Parse due date → UTC timestamp]
    L --> M[Validate due_at is in future]
    M --> N[Return ActionItem]

    style C fill:#22c55e,color:#fff
    style F fill:#ef4444,color:#fff
    style N fill:#3b82f6,color:#fff

Reviews (1): Last reviewed commit: "fix(backend): tighten action items extra..."
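
The flowchart above can be read as a short chain of early exits. The sketch below is only an illustration of that ordering, not the actual implementation: in the PR the decisions are made by the LLM reading the prompt, and no Python logic exists for them. The `Candidate` class, its boolean fields, `should_extract`, and `due_at_is_valid` are all hypothetical names, and treating a missing due date as valid is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Candidate:
    """Hypothetical per-candidate signals; in the real system the LLM judges these."""
    text: str
    explicit_request: bool = False          # "Remind me", "Add task", etc.
    doing_now_or_immediately: bool = False  # "I'm going to X", "Let me X"
    resolved_in_real_time: bool = False     # settled on the spot between participants
    would_be_forgotten: bool = True         # would a busy person genuinely forget it?
    passes_strict_rules: bool = True        # passes ALL 5 strict filtering rules
    duplicate_or_same_topic: bool = False   # duplicate of, or same topic as, an existing item
    due_at: Optional[datetime] = None       # timezone-aware UTC timestamp, if any


def should_extract(c: Candidate) -> bool:
    """Mirror the flowchart: explicit requests bypass every filter, then each check can SKIP."""
    if c.explicit_request:
        return True
    if c.doing_now_or_immediately or c.resolved_in_real_time:
        return False
    if not c.would_be_forgotten or not c.passes_strict_rules:
        return False
    return not c.duplicate_or_same_topic


def due_at_is_valid(due_at: Optional[datetime], now: datetime) -> bool:
    """Assumption: no due date is acceptable; otherwise it must lie strictly after `now`."""
    return due_at is None or due_at > now


now = datetime(2026, 3, 29, 12, 0, tzinfo=timezone.utc)
water = Candidate("I'm going to grab water", doing_now_or_immediately=True)
reminder = Candidate(
    "Remind me to call the dentist tomorrow",
    explicit_request=True,
    due_at=now + timedelta(days=1),
)
print(should_extract(water))     # False
print(should_extract(reminder))  # True
```

Writing the flow this way makes the precedence explicit: the explicit-request check runs first, so nothing later in the chain can suppress it.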

- "I'm going to X" → SKIP (about to do it right now)
- "I'll do X for you" → SKIP (immediate response to a request)
- "Let me X" → SKIP (taking action now)
- "Today I will X" → SKIP unless there's a specific time/deadline attached

P2 "Today I will X → SKIP" rule is self-contradictory

The new rule reads:

"Today I will X" → SKIP unless there's a specific time/deadline attached

But "today" is a specific time/deadline. A statement like "Today I need to file my taxes" contains a concrete same-day deadline that is exactly the kind of forgettable task the system should capture. The rule as written would cause the LLM to skip it because "today" is the only time reference and isn't paired with an additional clock-time (e.g., "today by 5pm").
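
To make the two readings concrete, here is a minimal sketch under stated assumptions. The regexes and the `today_rule` function are hypothetical stand-ins; the real decision is made by the LLM interpreting the prompt, not by pattern matching. The `today_counts` flag switches between treating "today" as a deadline in its own right and requiring an additional clock time.

```python
import re

# Hypothetical detectors, used only to illustrate the ambiguity.
TODAY = re.compile(r"\btoday\b", re.IGNORECASE)
CLOCK_TIME = re.compile(
    r"\b(?:by|at|before)\s+\d{1,2}(?::\d{2})?\s*(?:am|pm)\b", re.IGNORECASE
)


def today_rule(statement: str, today_counts: bool) -> str:
    """Apply 'SKIP unless there's a specific time/deadline attached' under one reading.

    today_counts=False: only an explicit clock time counts as a deadline.
    today_counts=True:  the word "today" itself counts as the deadline.
    """
    if CLOCK_TIME.search(statement):
        return "EXTRACT"
    if today_counts and TODAY.search(statement):
        return "EXTRACT"
    return "SKIP"


taxes = "Today I need to file my taxes"
print(today_rule(taxes, today_counts=False))  # SKIP
print(today_rule(taxes, today_counts=True))   # EXTRACT
```

Under the second reading every "Today I will X" statement is extracted, so the SKIP branch never fires; under the first, same-day commitments without a clock time are dropped. The rule text does not say which reading is intended.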

The previous rule explicitly called this scenario an EXTRACT: "Today, I want to complete the onboarding experience" → EXTRACT (stated goal with deadline).

Consider clarifying the intent — the goal seems to be to skip vague daily-habit statements ("Today I'll go to the gym") rather than actual deadline-anchored tasks ("Today I have to renew my insurance"). A cleaner phrasing might be:

"Today I will X" → SKIP unless it involves a hard commitment or forgettable deadline (e.g. filing, payments, submissions)

- "Call dentist" (existing) vs "Call plumber" → NOT duplicate (different person/service)
- "Submit report by March 1st" (existing) vs "Submit report by March 15th" → NOT duplicate (different deadlines)
• If you're unsure whether something is a duplicate, err on the side of treating it as a duplicate (DON'T extract)
• SINGLE-TOPIC LIMIT: If a conversation discusses one topic, extract AT MOST 1 action item for it — not one per variation, option, or detail mentioned in the discussion.

P2 SINGLE-TOPIC LIMIT conflicts with the ALWAYS-EXTRACT rule for explicit requests

The new dedup rule at this line says:

SINGLE-TOPIC LIMIT: If a conversation discusses one topic, extract AT MOST 1 action item for it

However, the EXPLICIT TASK/REMINDER REQUESTS section (lines 351–364) says these patterns ALWAYS extract regardless of other filters:

"Put X on my list" / "Add X to my tasks" → EXTRACT "X"

If a user explicitly asks for two reminders on the same general subject — e.g., "Remind me to call the dentist tomorrow" and "Add a task to pick up my prescription on Friday" — both are health-related and could be collapsed to one item by the SINGLE-TOPIC LIMIT, despite being distinct explicit requests.

The SINGLE-TOPIC LIMIT needs a carve-out, mirroring the "NOTE: Skip this requirement if user explicitly asked for a reminder/task" pattern used in rules 3 and 4:

Suggested change
- SINGLE-TOPIC LIMIT: If a conversation discusses one topic, extract AT MOST 1 action item for it — not one per variation, option, or detail mentioned in the discussion.
+ SINGLE-TOPIC LIMIT: If a conversation discusses one topic, extract AT MOST 1 action item for it — not one per variation, option, or detail mentioned in the discussion. NOTE: Skip this limit if each item was explicitly requested by the user.
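
The intended effect of the carve-out can be sketched as a post-filter. This is a hedged illustration only: the PR enforces the limit through prompt wording, and the `Item` class, its `topic` label, and `apply_single_topic_limit` are hypothetical. It also assumes that an explicit item occupies its topic, so implicit items on the same topic are dropped; the suggested wording does not settle that point.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Item:
    description: str
    topic: str              # hypothetical topic label assigned upstream
    explicit: bool = False  # True when the user explicitly asked for the task/reminder


def apply_single_topic_limit(items: List[Item]) -> List[Item]:
    """Keep every explicit item; keep at most one implicit item per remaining topic."""
    # Assumption: topics already covered by an explicit request admit no implicit items.
    seen_topics = {item.topic for item in items if item.explicit}
    kept = []
    for item in items:
        if item.explicit:
            kept.append(item)  # carve-out: the limit never collapses explicit requests
        elif item.topic not in seen_topics:
            seen_topics.add(item.topic)
            kept.append(item)
    return kept


items = [
    Item("Call the dentist tomorrow", "health", explicit=True),
    Item("Pick up prescription on Friday", "health", explicit=True),
    Item("Get water from the kitchen", "drinks"),
    Item("Get soda from the kitchen", "drinks"),
]
print([i.description for i in apply_single_topic_limit(items)])
# ['Call the dentist tomorrow', 'Pick up prescription on Friday', 'Get water from the kitchen']
```

Both explicit health reminders survive, while the implicit "drinks" variations collapse to a single item.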

mdmohsin7 merged commit 61fe80d into main on Apr 8, 2026 (3 checks passed)
mdmohsin7 deleted the worktree-fix-action-items-quality branch on April 8, 2026 12:50