Add stop population/update feature#107

Merged
simantak-dabhade merged 12 commits into main from feature/stop-population on Jun 2, 2026

@MMeteorL
Collaborator

Summary

Adds a Stop button in the Settings dropdown that cancels an in-flight populate or update run. Already-collected rows are kept; the dataset transitions immediately to live and the ready email is sent with the row count collected so far.

How it works

Signal threading

The Vercel AI SDK (and the underlying fetch to OpenRouter) only honours cancellation if an AbortSignal is passed explicitly to the call — controller.abort() does not propagate automatically through async call chains. The signal is threaded via abortSignal to every agent.generate() call:

  • Orchestrator (agentStep in populate.ts) — first place a stop fires during population
  • run_subagent tool (investigate-tool.ts) — re-throws AbortError so it propagates back up through the tool call to the orchestrator, not swallowed as a structured failure
  • Refresh agents (refreshRowsStep in update.ts) — each processRow re-throws AbortError so the worker exits; Promise.allSettled winds all workers down
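
The threading described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical `fakeGenerate` that stands in for `agent.generate()`; it is not the Mastra or AI SDK API, and `orchestrate` is likewise an invented name. It only shows that an abort reaches the inner call when the signal is passed explicitly at each hop.

```typescript
// Builds an error shaped like the AbortError that fetch-style APIs throw.
function makeAbortError(): Error {
  const err = new Error("The operation was aborted");
  err.name = "AbortError";
  return err;
}

// Hypothetical stand-in for agent.generate(): a slow call that can only be
// interrupted through the signal it is given.
function fakeGenerate(
  prompt: string,
  opts: { abortSignal?: AbortSignal } = {},
): Promise<string> {
  return new Promise<string>((resolve, reject) => {
    const signal = opts.abortSignal;
    if (signal?.aborted) {
      reject(makeAbortError());
      return;
    }
    const timer = setTimeout(() => resolve(`done: ${prompt}`), 20);
    signal?.addEventListener(
      "abort",
      () => {
        clearTimeout(timer);
        reject(makeAbortError());
      },
      { once: true },
    );
  });
}

// Each layer forwards the signal it received; dropping it at any hop would
// let the inner call run to completion after controller.abort().
async function orchestrate(signal: AbortSignal): Promise<string> {
  return fakeGenerate("find rows", { abortSignal: signal });
}
```

Calling `fakeGenerate("x")` without a signal and then aborting an unrelated controller leaves the call untouched, which is the failure mode the explicit threading avoids.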

Stop during populate

agentStep throws AbortError → Mastra marks the step failed → run.start() returns { status: "failed" } → background runner catches it, checks controller.signal.aborted, calls setDatasetPopulateStatus("live") + sendDatasetReadyNotification (skipped if 0 rows).

Stop during update (different path)

processRow re-throws AbortError → workers exit early via Promise.allSettled (no throw) → refreshRowsStep detects signal.aborted after processWithConcurrency returns, calls the new clearAllPendingUpdateStatus Convex mutation to reset any rows that were never reached → step returns normally → Mastra sees success → the existing success path sets "live" and sends the email. This is why the update runner has no abort catch branch — it isn't needed.
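
A rough sketch of this update path, under stated assumptions: `processWithConcurrency` is named in the description but its signature here is guessed, rows are plain in-memory objects instead of Convex documents, and the bulk clear stands in for `clearAllPendingUpdateStatus`. The point is that `Promise.allSettled` absorbs the workers' AbortErrors, so the step can check `signal.aborted` afterwards and return normally.

```typescript
type Row = { id: string; updateStatus?: "pending" };

function makeAbortError(): Error {
  const err = new Error("The operation was aborted");
  err.name = "AbortError";
  return err;
}

// Assumed shape of the concurrency helper: `limit` workers pull from a shared cursor.
async function processWithConcurrency<T>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<void>,
): Promise<void> {
  let next = 0;
  const worker = async (): Promise<void> => {
    while (next < items.length) {
      const item = items[next++] as T;
      await fn(item); // a thrown AbortError ends this worker only
    }
  };
  // allSettled never rejects, so worker failures do not escape this call.
  await Promise.allSettled(Array.from({ length: limit }, () => worker()));
}

async function refreshRows(
  rows: Row[],
  signal: AbortSignal,
  refresh: (row: Row) => Promise<void>,
): Promise<{ stopped: boolean; cleared: number }> {
  await processWithConcurrency(rows, 2, async (row) => {
    if (signal.aborted) throw makeAbortError();
    await refresh(row);
    delete row.updateStatus; // row refreshed normally
  });

  // After the workers wind down, reset any rows that were never reached.
  let cleared = 0;
  if (signal.aborted) {
    for (const row of rows) {
      if (row.updateStatus === "pending") {
        delete row.updateStatus;
        cleared++;
      }
    }
  }
  return { stopped: signal.aborted, cleared };
}
```

Because `refreshRows` resolves instead of throwing, a caller that only distinguishes success from failure would take its ordinary success path, which matches the reasoning given for the missing abort catch branch.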

Abort registry (abort-registry.ts)

A module-level Map<datasetId, AbortController>. Keyed by datasetId because:

  • /stop knows the datasetId from the request body
  • Convex's atomic claim guarantees at most one active run per dataset (so datasetId is unique)
  • Workflow steps already have authorizedDatasetId/datasetId in their inputData — no extra lookup field needed

This eliminates the need for a second activeDatasets: Map<datasetId, runId> bridging map; the /stop route calls abortDataset(datasetId) directly.
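
For illustration, a registry with the four functions named in this PR might look like the sketch below. The function names come from the description; the exact signatures and return types are assumptions (the later discussion does imply that `abortDataset()` returns a boolean).

```typescript
// Module-level map: one controller per dataset with an active run.
const controllers = new Map<string, AbortController>();

// Creates and stores a controller for a new run.
function registerDataset(datasetId: string): AbortController {
  const controller = new AbortController();
  controllers.set(datasetId, controller);
  return controller;
}

// Lets workflow steps look up the signal from the datasetId they already hold.
function getSignal(datasetId: string): AbortSignal | undefined {
  return controllers.get(datasetId)?.signal;
}

// Returns true if a run was found and signalled, false if there was nothing to abort.
function abortDataset(datasetId: string): boolean {
  const controller = controllers.get(datasetId);
  if (!controller) return false;
  controller.abort();
  return true;
}

// Removes the entry once the run has finished.
function deregisterDataset(datasetId: string): void {
  controllers.delete(datasetId);
}
```

Keying by datasetId means a /stop request needs no second lookup, at the cost of relying on the one-active-run-per-dataset guarantee stated above.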

finaliseRunAsLive() helper

Both background runners share the same stop-success sequence: query the dataset, set status "live", count rows, send the ready email. Extracted into a single helper to avoid duplication.
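
The sequence could be sketched like this. The real helper's parameters differ (the review below shows it taking `logger`, `clerk`, `datasetId`, and `authorizedUserId`), so `finaliseRunAsLiveSketch` and the `FinaliseDeps` interface are hypothetical names used only to make the order of operations and the zero-row rule concrete.

```typescript
// Hypothetical dependencies, injected so the sketch stays self-contained.
interface FinaliseDeps {
  setStatus(datasetId: string, status: "live"): Promise<void>;
  countRows(datasetId: string): Promise<number>;
  sendReadyEmail(datasetId: string, rowCount: number): Promise<void>;
}

async function finaliseRunAsLiveSketch(
  datasetId: string,
  deps: FinaliseDeps,
): Promise<{ rowCount: number; emailed: boolean }> {
  // Status goes live first so the UI leaves the busy state promptly.
  await deps.setStatus(datasetId, "live");
  const rowCount = await deps.countRows(datasetId);
  // The ready email is skipped when nothing was collected.
  if (rowCount === 0) return { rowCount, emailed: false };
  await deps.sendReadyEmail(datasetId, rowCount);
  return { rowCount, emailed: true };
}
```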

runStats on stop

The existing finally block in agentStep saves runStats regardless of how the step exits — a stopped populate run records status: "error" with error: "Stopped by user" so the run is visible in metrics without polluting the failure count.

Files changed

| File | Change |
| --- | --- |
| backend/src/abort-registry.ts | New: Map<datasetId, AbortController> with registerDataset, getSignal, abortDataset, deregisterDataset |
| backend/src/index.ts | Register/deregister runs; finaliseRunAsLive() helper; /stop route; populate abort catch; update runner deregistration |
| backend/src/mastra/workflows/populate.ts | Pass abortSignal to agent.generate(); label AbortError as "Stopped by user" in runStats |
| backend/src/mastra/workflows/update.ts | Pass abortSignal to agent.generate(); re-throw AbortError in processRow; bulk-clear pending rows post-abort |
| backend/src/mastra/tools/investigate-tool.ts | Pass abortSignal to sub-agent; re-throw AbortError to propagate cancellation |
| frontend/convex/datasetRows.ts | New clearAllPendingUpdateStatus internal mutation; bulk-clears updateStatus: "pending" rows |
| frontend/lib/backend.ts | New stopPopulation(datasetId, token) → POST /stop |
| frontend/lib/analytics.ts | Add DATASET_STOP_REQUESTED event |
| frontend/app/dataset/[id]/page.tsx | handleStop, stopping state; SettingsDropdown swaps Update/Populate for amber Stop button while dataset is busy |

Test plan

  • Start a populate run, open Settings → Stop button is visible (amber, square-stop icon)
  • Click Stop — button shows "Stopping…" briefly, dataset transitions from building to live, existing rows are preserved
  • Start an update run, click Stop — pending shimmers clear, dataset transitions to live, existing row data unchanged
  • Stop with 0 rows collected — dataset goes live with no email sent
  • After a natural completion, Settings shows Update + Clear & Populate (not Stop)
  • Run make convex-push to deploy clearAllPendingUpdateStatus before testing the update stop path

🤖 Generated with Claude Code

@coderabbitai
Contributor

coderabbitai Bot commented May 30, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds user-initiated stopping for dataset build/update runs: an in-process AbortController registry, wiring of register/abort/deregister into populate/update background runners, propagation of dataset-scoped AbortSignals into agent/subagent calls with AbortError rethrowing, a protected POST /stop endpoint, frontend Stop UI and client, analytics event, and a Convex mutation and schema index to clear pending updateStatus after aborted updates.

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant Frontend
  participant BackendAPI
  participant AbortRegistry
  participant Worker as Populate/Update Worker
  participant Agent
  participant ConvexDB

  User->>Frontend: clicks "Stop"
  Frontend->>BackendAPI: POST /stop (datasetId, Bearer token)
  BackendAPI->>AbortRegistry: abortDataset(datasetId)
  AbortRegistry-->>BackendAPI: abort triggered / no-op
  BackendAPI-->>Frontend: 202 or 200
  par background worker flow
    Worker->>AbortRegistry: registerDataset(datasetId)
    AbortRegistry->>Worker: AbortSignal
    Worker->>Agent: agent.generate(..., abortSignal)
    Agent-->>Worker: throws AbortError
    Worker->>ConvexDB: finaliseRunAsLive / mark failed / clear pending statuses
    Worker->>AbortRegistry: deregisterDataset(datasetId)
  end

Possibly related PRs

  • tinyfish-io/bigset#104: Extends the update background/refresh workflow; overlaps with abort/pending-status handling.
  • tinyfish-io/bigset#81: Modifies the same investigate-tool used by this PR (agent/subagent invocation changes).
  • tinyfish-io/bigset#85: Prior populate/back-end orchestration edits that interact with the populate-agent error paths modified here.

Suggested reviewers

  • simantak-dabhade
  • giaphutran12

🚥 Pre-merge checks: ✅ 4 passed

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'Add stop population/update feature' clearly and concisely summarizes the main change: introducing a Stop button to cancel in-flight populate or update runs. |
| Description check | ✅ Passed | The description comprehensively explains the feature, implementation approach (signal threading, different paths for populate vs update), abort registry design, API changes, and includes a detailed test plan aligned with the changeset. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 5

🧹 Nitpick comments (1)
frontend/lib/backend.ts (1)

115-135: ⚡ Quick win

Route this through the shared backend fetch helper.

This adds another copy of the same fetch/error-parsing/auth-header logic already repeated in this file. Please extract or reuse a single typed request helper here so response-shape and error-handling changes don't drift per endpoint.

As per coding guidelines, "Implement typed fetch wrapper in lib/backend.ts for calling the Fastify backend with Bearer token authentication".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@frontend/lib/backend.ts` around lines 115 - 135, stopPopulation duplicates
fetch/auth/error-parsing logic; replace its direct fetch with the shared typed
fetch wrapper (implement or reuse a helper such as backendFetch/requestWithAuth
that accepts path, method, body, token and returns typed JSON or throws parsed
errors). Modify stopPopulation to call that helper (passing "/stop", method
"POST", { datasetId } and token) and return the typed result, removing the
inline headers, JSON.stringify, and manual res.ok handling so all endpoints
share single error/response handling.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/src/index.ts`:
- Around line 360-368: When handling controller.signal.aborted and calling
finaliseRunAsLive({ logger, clerk, datasetId, authorizedUserId }), do not
unconditionally return before ensuring the controller registry remains available
if finalisation fails: add a boolean flag (e.g., finaliseSucceeded) set to true
only after finaliseRunAsLive completes successfully, and change the finally
block that currently deregisters the controller so it only removes the
controller when finaliseSucceeded is true (or when the run truly finished),
leaving the /stop registry entry in place if finalisation failed; reference
controller.signal.aborted, finaliseRunAsLive, and the finally block that
deregisters the controller to locate where to add the flag and conditional
deregistration.

In `@backend/src/mastra/tools/investigate-tool.ts`:
- Around line 116-117: The call to agent.generate currently caps subagents at 10
steps; change the maxSteps option to 25 to match the investigate tool contract:
replace agent.generate(prompt, { abortSignal, maxSteps: 10 }) with maxSteps: 25,
and ensure any usage that spawns investigate subagents via
buildInvestigateTool(authorizedDatasetId, authContext, columns) also passes/uses
the 25-step budget and continues to parse the tool's structured output as
expected (refer to getSignal and agent.generate to locate the call to update).

In `@backend/src/mastra/workflows/populate.ts`:
- Around line 237-238: The cancelation is ineffective because the AbortSignal
from getSignal(inputData.authorizedDatasetId) is only passed into agent.generate
at the end; change enumerateStep (and any internal call generateText) to accept
an abortSignal parameter and forward it into agent.generate (or the lower-level
generateText call) so the in-progress text generation can be aborted
immediately; additionally, have enumerateStep check abortSignal.aborted at
sensible points and throw or return early when aborted to ensure the dataset
transitions to live promptly.

In `@frontend/app/dataset/`[id]/page.tsx:
- Around line 173-194: The Stop action un-latches too early because handleStop
clears stopping in the finally block even though the dataset may still be in
"building"/"updating"; change handleStop (and the similar code around lines
212-213) to keep stopping true until the dataset actually leaves the busy
states: setStopping(true) before calling stopPopulation, remove
setStopping(false) from the immediate finally, and instead clear stopping only
after you observe dataset.status !== "building" && dataset.status !== "updating"
(e.g. subscribe to the dataset state update, poll the dataset status once or
listen for the status change event) so the button remains disabled during the
abort/cleanup window and duplicate stop requests/events are prevented.

In `@frontend/convex/datasetRows.ts`:
- Around line 315-330: The mutation clearAllPendingUpdateStatus currently calls
ctx.db.query("datasetRows").withIndex("by_dataset").collect() and patches every
pending row in one run, risking Convex transaction limits; change it to process
rows in bounded batches using pagination (e.g., use the query
withIndex(...).paginate or iterate with a limit and resume token) and patch only
each page's rows, or split the work into repeated smaller internalMutations
until no pending rows remain, while still returning the total cleared; update
references to the query call (datasetRows by_dataset), the collect() usage, and
the ctx.db.patch(row._id, ...) logic to operate per-page with a safe limit and
resume token.

---

Nitpick comments:
In `@frontend/lib/backend.ts`:
- Around line 115-135: stopPopulation duplicates fetch/auth/error-parsing logic;
replace its direct fetch with the shared typed fetch wrapper (implement or reuse
a helper such as backendFetch/requestWithAuth that accepts path, method, body,
token and returns typed JSON or throws parsed errors). Modify stopPopulation to
call that helper (passing "/stop", method "POST", { datasetId } and token) and
return the typed result, removing the inline headers, JSON.stringify, and manual
res.ok handling so all endpoints share single error/response handling.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f093be88-b2b9-4690-a1d3-ac2fa97c7b54

📥 Commits

Reviewing files that changed from the base of the PR and between b097e38 and 7c925d6.

📒 Files selected for processing (9)
  • backend/src/abort-registry.ts
  • backend/src/index.ts
  • backend/src/mastra/tools/investigate-tool.ts
  • backend/src/mastra/workflows/populate.ts
  • backend/src/mastra/workflows/update.ts
  • frontend/app/dataset/[id]/page.tsx
  • frontend/convex/datasetRows.ts
  • frontend/lib/analytics.ts
  • frontend/lib/backend.ts

Collaborator Author

@MMeteorL MMeteorL left a comment

All pending comments are resolved. I also addressed the risk of a spurious AbortError terminating a run. Stop works well in testing on both populate and update runs, so this should be good to go.

Collaborator

@giaphutran12 giaphutran12 left a comment

Requesting changes before merge.

[P1] frontend/app/dataset/[id]/page.tsx:87 introduces setStopping(false) directly inside an effect. In a fresh PR worktree, bun run lint fails with react-hooks/set-state-in-effect on that exact line, so this PR cannot pass the local frontend gate.

Fix: avoid synchronous state updates in the effect. One small option is to schedule the clear (setTimeout(..., 0) with cleanup) or refactor the stop latch so the UI derives pending state from dataset busy status without calling setState directly in the effect body. Re-run bun run lint afterward.

@MMeteorL MMeteorL requested a review from giaphutran12 May 31, 2026 06:34
@MMeteorL
Collaborator Author

> Requesting changes before merge.
>
> [P1] frontend/app/dataset/[id]/page.tsx:87 introduces setStopping(false) directly inside an effect. In a fresh PR worktree, bun run lint fails with react-hooks/set-state-in-effect on that exact line, so this PR cannot pass the local frontend gate.
>
> Fix: avoid synchronous state updates in the effect. One small option is to schedule the clear (setTimeout(..., 0) with cleanup) or refactor the stop latch so the UI derives pending state from dataset busy status without calling setState directly in the effect body. Re-run bun run lint afterward.

Thank you for taking the time to review this! The fix has been applied by removing the effect, and clearing stopping in handlePopulate and handleUpdate.

Collaborator

@giaphutran12 giaphutran12 left a comment

Holding for launch safety. The old lint issue is fixed, but the stop button state can still stay latched after a successful stop because stopping is never cleared when the dataset leaves building/updating. Please clear the latch from Convex status or derive the button label from the busy state, then rerun checks.

@MMeteorL
Collaborator Author

MMeteorL commented Jun 1, 2026

> Holding for launch safety. The old lint issue is fixed, but the stop button state can still stay latched after a successful stop because stopping is never cleared when the dataset leaves building/updating. Please clear the latch from Convex status or derive the button label from the busy state, then rerun checks.

The following issues have been fixed, and tests have passed.

Bug 1 — Stop button latch after successful stop
Problem: After clicking Stop and the dataset successfully transitioned out of "building"/"updating", the stopping state was never cleared. The button stayed permanently disabled.

Fix (frontend — frontend/app/dataset/[id]/page.tsx):

  • Introduced a useEffect that watches isDatasetBusy (derived from Convex's live status). When the dataset leaves the busy state, it schedules setStopping(false) via setTimeout(..., 0). The setTimeout is required to satisfy the react-hooks/set-state-in-effect lint rule, which prohibits synchronous setState calls in effect bodies.
  • handleStop no longer clears stopping on success; it stays true until Convex confirms the transition. It only clears immediately on network error, so the user can retry.
  • stopDisabled = stopping || !isDatasetBusy: both conditions must be false for the button to be active, preventing repeat clicks while a stop request is in flight.

Bug 2 — Stale "building" status after Docker restart
Problem: After a server restart, the in-memory abort registry is wiped, but Convex still shows the dataset as "building"/"updating". Clicking Stop returned a silent no-op, leaving the dataset permanently stuck.

Fix (backend — backend/src/index.ts):

The /stop handler now distinguishes between two cases when abortDataset() returns false:

  • Before: Treated as "run just finished" and returned 200 with no action.
  • After: Checks the invariant that the normal finish path always sets a terminal Convex status before deregistering. So if status is still busy but no registry entry exists, the run is definitively orphaned. The handler now force-transitions it to "failed" with the message "Run interrupted: server restarted while building".
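
The decision the handler makes can be sketched as a pure function. This is an illustration only: `decideStop` and the `StopResult` shape are hypothetical, the status values are the ones mentioned in this thread, and the message wording for the "updating" case is an assumption extrapolated from the "building" message quoted above.

```typescript
type DatasetStatus = "building" | "updating" | "live" | "failed";

type StopResult =
  | { action: "abort_signalled" }
  | { action: "marked_failed"; message: string }
  | { action: "nothing_to_stop" };

// abortFired is the boolean returned by abortDataset(datasetId);
// status is the dataset's current status as stored in Convex.
function decideStop(abortFired: boolean, status: DatasetStatus): StopResult {
  if (abortFired) return { action: "abort_signalled" };

  // No registry entry. If Convex still reports a busy status, the run was
  // orphaned (for example by a restart) and must be forced out of that state.
  const busy = status === "building" || status === "updating";
  if (busy) {
    return {
      action: "marked_failed",
      message: `Run interrupted: server restarted while ${status}`,
    };
  }

  // No registry entry and a terminal status: the run finished on its own.
  return { action: "nothing_to_stop" };
}
```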

Bug 3 — TOCTOU race in abort registry (found during impact audit)
Problem: registerDataset() was called inside the background async functions, after void runXxxWorkflowInBackground() had already returned control to the route handler. A /stop request arriving in that tiny gap would see an empty registry, incorrectly trigger the orphan path, force-transition the dataset to "failed" — and then the real runner would later overwrite it back to "live", corrupting state.

Fix (backend — backend/src/index.ts):

  • For both /populate and /update: registerDataset() is now called in the route handler, synchronously, before the void call. This guarantees the registry entry is visible the instant the 202 response is sent.
  • runPopulateWorkflowInBackground was updated to accept the pre-created AbortController as a parameter (instead of creating it internally).
  • Both background functions had their internal registerDataset() calls removed (replaced with explanatory comments).
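
A minimal sketch of the ordering fix, assuming hypothetical `handlePopulate`, `handleStop`, and `runInBackground` functions in place of the real Fastify routes and runners. The real route handler discards the promise with `void`; here it is returned so the outcome can be observed.

```typescript
const registry = new Map<string, AbortController>();

// Hypothetical background runner: receives the pre-created controller and
// deregisters when it finishes, whatever the outcome.
async function runInBackground(
  datasetId: string,
  controller: AbortController,
): Promise<"stopped" | "completed"> {
  try {
    await Promise.resolve(); // stands in for the first real await (DB or LLM call)
    return controller.signal.aborted ? "stopped" : "completed";
  } finally {
    registry.delete(datasetId);
  }
}

function handlePopulate(datasetId: string): Promise<"stopped" | "completed"> {
  const controller = new AbortController();
  // Registered synchronously in the handler, before any async work begins,
  // so a stop request can never observe a missing entry for a live run.
  registry.set(datasetId, controller);
  return runInBackground(datasetId, controller);
}

function handleStop(datasetId: string): boolean {
  const controller = registry.get(datasetId);
  if (!controller) return false;
  controller.abort();
  return true;
}
```

With this ordering, a stop that arrives immediately after the populate handler returns still finds the entry and aborts the run instead of falling into the orphan path.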

MMeteorL and others added 8 commits June 1, 2026 16:35

Adds a Stop button in Settings that cancels in-flight populate or update runs, preserving all collected rows and transitioning the dataset to 'live'.

Implementation:
- abort-registry.ts: module-level Map<runId, AbortController> shared across the process
- /stop route: looks up the active runId for a dataset, fires AbortController.abort()
- AbortSignal threaded via abortSignal parameter to every agent.generate() call (orchestrator, subagent, refresh agents)
- run_subagent tool re-throws AbortError so cancellation propagates up to the orchestrator
- Update workflow: processRow re-throws AbortError so workers exit early; a post-concurrency check bulk-clears any remaining pending row updateStatus fields
- Background runners detect controller.signal.aborted in catch, set status → 'live', send ready notification with collected row count
- Convex: new clearAllPendingUpdateStatus internal mutation to bulk-reset pending shimmers on stop
- runStats still saved via existing finally block (recorded as stopped with errorMsg "Stopped by user")

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Three refactors to the stop-population implementation:

1. Key abort-registry by datasetId instead of workflowRunId
   Convex's atomic claim already guarantees at most one active run per
   dataset, so datasetId is a valid unique key. This eliminates the
   activeDatasets Map that existed only to bridge datasetId → runId →
   AbortController, and simplifies all call sites (steps already have
   authorizedDatasetId/datasetId in scope).

2. Extract finaliseRunAsLive() helper
   Both background runners had identical ~12-line blocks to handle a
   user stop: query dataset, set status "live", count rows, conditionally
   email. Extracted into a shared helper.

3. Remove dead abort catch in runUpdateWorkflowInBackground
   When a stop fires during an update, processRow re-throws AbortError,
   Promise.allSettled winds down the workers, refreshRowsStep detects
   signal.aborted, clears pending rows, and returns normally. Mastra
   sees a successful step, run.start() returns {status:"success"}, and
   the existing success path handles the live transition. The
   controller.signal.aborted branch in the catch was unreachable.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Backend fixes:
- finaliseRunAsLive failure: if the live-status mutation fails after a user
  stop, fall back to setting status "failed" so the dataset always leaves
  "building". Previously a failed finalisation returned without touching
  the status, leaving the dataset permanently stuck with no registry entry
  for /stop to act on.
- investigate-tool maxSteps: fix 10 → 25 to match the contract in CLAUDE.md
- enumerateStep abort: thread abortSignal into generateText so a stop
  pressed during enumeration (a ~10-token LLM call) cancels immediately
  rather than waiting for the call to complete. Re-throws AbortError so
  Mastra marks the step failed and the background runner's abort-detection
  fires.

Frontend fixes:
- stopping latch: don't clear stopping in handleStop's finally. Instead,
  a useEffect clears it when dataset.status transitions out of "building"/
  "updating". This keeps the Stop button in "Stopping…"/disabled state
  until Convex confirms the run has actually finished, preventing the
  re-enable flash and duplicate stop requests during the cleanup window.
  Only clears immediately on fetch error (run was never reached).
- backend.ts: extract backendPost<T> helper — 4 functions were duplicating
  the same fetch/auth-header/error-parse boilerplate.
- clearAllPendingUpdateStatus: add scale note about Convex per-mutation
  document limits.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

maxSteps in investigate-tool.ts: revert 25 → 10 to match main. The
CodeRabbit comment cited CLAUDE.md describing the extract-tool-pipeline
architecture where subagents use 25 steps — not applicable on main.

backendPost helper: revert to individual fetch functions. The refactor
adds a layer of indirection without making the stop feature safer or
simpler. Code quality cleanups belong in a separate PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Add a compound index by_dataset_update_status (["datasetId",
"updateStatus"]) to datasetRows so the mutation can query only the
pending rows directly rather than scanning the entire dataset table.

Before: collect() all rows for the dataset, filter in TS, patch pending.
After: index query returns only pending rows; no other rows are scanned.

Requires make convex-push to deploy the schema change.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Tighten all three AbortError re-throw guards to check
getSignal(datasetId)?.aborted before propagating. Without this, spurious
network AbortErrors (undici ECONNRESET, dropped SSE streams, SDK-internal
cleanup aborts) were re-thrown as if the user had pressed Stop.

In investigate-tool.ts this caused the orchestrator's agent.generate() to
receive a tool-thrown AbortError and exit early with a graceful empty
result, producing a successful workflow run with 0 rows inserted. The fix
restores the original structured-failure return path for all non-user aborts
so the orchestrator can log the failure and continue with other entities.
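
A sketch of the tightened guard, under assumptions: `runSubagent` and its result shape are hypothetical, and the signal argument stands in for `getSignal(datasetId)`. An AbortError is re-thrown only when the dataset's own signal has fired; any other failure, including a spurious network abort, becomes a structured failure.

```typescript
function isAbortError(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { name?: unknown }).name === "AbortError"
  );
}

type SubagentResult<T> =
  | { ok: true; value: T }
  | { ok: false; error: string };

async function runSubagent<T>(
  task: () => Promise<T>,
  signal: AbortSignal | undefined, // stands in for getSignal(datasetId)
): Promise<SubagentResult<T>> {
  try {
    return { ok: true, value: await task() };
  } catch (err) {
    // Propagate only a user-initiated stop.
    if (isAbortError(err) && signal?.aborted) throw err;
    // Everything else is reported so the caller can continue with other entities.
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```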

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Calling setStopping(false) synchronously inside a useEffect body
triggers the react-hooks/set-state-in-effect rule, blocking the
frontend lint gate.

Remove the effect entirely. The stopping latch only needs to reset
before the *next* populate or update run starts. Both handlePopulate
and handleUpdate are already guarded against running while the dataset
is busy, so they only fire after the status has settled to live/failed.
Adding setStopping(false) at the top of each handler — right before
setPopulating/setUpdating — is both correct and semantically clearer:
"starting a new run discards any leftover stop-latch from the prior one."

While the dataset is non-busy, the Stop button is hidden entirely
(replaced by Update/Clear & Populate), so the stale stopping=true value
is invisible and harmless until those handlers run.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Frontend: hoist isDatasetBusy before the loading guard (optional chaining)
  so the useEffect dep array never hits the TDZ during the loading state.
  Add a useEffect + setTimeout to clear the `stopping` latch once Convex
  confirms the dataset has left the busy state, satisfying the
  react-hooks/set-state-in-effect lint rule.

- Backend (/stop): detect orphaned datasets (busy in Convex but no abort
  registry entry) and force-transition them to "failed" so they are never
  permanently stuck after a server restart.

- Backend (TOCTOU): move registerDataset() from inside the background async
  functions to the route handlers, before the void call. This closes the race
  window where a /stop arriving before the background function ran
  registerDataset would incorrectly trigger the orphan path against a live run.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@MMeteorL MMeteorL force-pushed the feature/stop-population branch from 6665055 to fb256bf Compare June 1, 2026 23:40
@MMeteorL MMeteorL requested a review from giaphutran12 June 2, 2026 01:48
Contributor

@simantak-dabhade simantak-dabhade left a comment

LGTM. Great feature.

@simantak-dabhade simantak-dabhade removed the request for review from giaphutran12 June 2, 2026 22:41
@simantak-dabhade simantak-dabhade dismissed giaphutran12’s stale review June 2, 2026 22:43

Addressed in follow-up commits; proceeding with owner-requested merge.

@simantak-dabhade simantak-dabhade merged commit 870dbd7 into main Jun 2, 2026
3 checks passed
@simantak-dabhade simantak-dabhade deleted the feature/stop-population branch June 2, 2026 22:43