OUT-3761: should_retry column for terminal sync log failures#255
Conversation
…ailures ACCOUNT/AUTH webhook failures used to set deletedAt=now() on the live claim row, which pulled it out of uq_qb_sync_logs_oneshot_active (WHERE deleted_at IS NULL) and let subsequent webhook deliveries re-claim, re-fail, and duplicate FAILED rows + downstream notifications. Replace the deletedAt-at-insert hack with an explicit should_retry boolean column. Terminal failures keep deletedAt=null so the partial unique index still dedupes repeat claims; the cron retry sweep filters shouldRetry=true to skip terminal categories. AUTH is terminal (revived only on OAuth reconnect via includeNoRetry); ACCOUNT stays retryable so the next cron sweep self-heals once the customer renews their QBO subscription. Also replaces getAllFailedLogsForWorkspace's includeDeleted parameter with includeNoRetry; soft-deleted rows are now permanently excluded from the retry sweep. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Greptile SummaryThis PR replaces the
Confidence Score: 4/5Safe to merge; fixes a real deduplication bug without breaking existing retry or notification flows The core fix is sound: keeping
Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Webhook arrives] --> B[claimWebhookEvent\nINSERT ON CONFLICT DO NOTHING]
B -->|claimed=false\nrow already exists| C[Return 200 — deduplicated]
B -->|claimed=true\nPENDING row inserted| D[Execute sync handler]
D -->|success| E[Update row to SUCCESS\ndeletedAt=null]
D -->|error| F[getShouldRetryForCategory]
F -->|AUTH category\nrefresh token dead| G[shouldRetry=false\ndeletedAt=null\nUnique index still holds slot]
F -->|ACCOUNT / RATE_LIMIT / OTHERS / etc.| H[shouldRetry=true\ndeletedAt=null\nUnique index still holds slot]
G --> I[Normal cron sweep\nexcludes AUTH row]
H --> J[Normal cron sweep\nretries row up to MAX_ATTEMPTS=25]
I --> K[OAuth reconnect\nsyncFailedRecords includeNoRetry=true\nIncludes AUTH rows]
K --> D
Reviews (1): Last reviewed commit: "feat(OUT-3761): add should_retry column ..." | Re-trigger Greptile |
ACCOUNT failures are now retryable (kept in qb_sync_logs with shouldRetry=true); with MAX_ATTEMPTS=25 a 3h cadence burned the attempt counter in ~75h (~3 days), often before a customer renewed their QBO subscription. 12h × 25 = ~300h (~12.5 days) of retry headroom, which matches the realistic recovery window without raising MAX_ATTEMPTS. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
priosshrsth
left a comment
There was a problem hiding this comment.
@SandipBajracharya Minor nitpick reagrding parameter name. Otherwise the PR looks good to me.
Summary
deletedAt = now()hack on ACCOUNT-category failures with an explicitshould_retryboolean. The hack brokeuq_qb_sync_logs_oneshot_active(filteredWHERE deleted_at IS NULL) and let every webhook retry re-claim, producing duplicate FAILED rows + duplicate downstream notifications.deletedAt = nullso the partial unique index keeps deduping. AUTH stays terminal (shouldRetry = false, revived on OAuth reconnect viaincludeNoRetry: true). ACCOUNT stays retryable so the next cron sweep self-heals once the customer renews their QBO subscription (MAX_ATTEMPTS = 25bounds the waste).getAllFailedLogsForWorkspace'sincludeDeletedparameter is replaced withincludeNoRetry; soft-deleted rows are now permanently excluded from the retry sweep.Ticket
https://linear.app/assemblycom/issue/OUT-3761/duplicate-quickbooks-notification-are-dispatched-for-sync-failed
Test plan
getShouldRetryForCategory(9 cases covering allFailedRecordCategoryTypebranches + the undefined path).accountFailureClaimDedup.test.ts) — one FAILED row withshouldRetry = true, deletedAt = null, repeat webhook delivery returnsclaimed: false.authFailureClaimDedup.test.ts) — one FAILED row withshouldRetry = false, deletedAt = null, repeat webhook delivery returnsclaimed: false.getAllFailedLogsForWorkspace(includeNoRetry)(retrySweepIncludeNoRetry.test.ts) — default skips terminal rows;trueincludes them.yarn test— 36 files / 269 tests passing.🤖 Generated with Claude Code