
feat(sdk): AI SDK custom useChat transport & chat.task harness#3173

Open
ericallam wants to merge 154 commits into main from feature/tri-7532-ai-sdk-chat-transport-and-chat-task-system

Conversation


@ericallam ericallam commented Mar 4, 2026

tl;dr

Run AI SDK chat completions as durable Trigger.dev agents.

Define your agent in one function, wire useChat to it from React, and the conversation survives page refreshes, network blips, and process restarts. Tools, multi-turn state, HITL approvals, stop-mid-stream, branching, hydration from your own DB. Frontend stays standard AI SDK useChat — only the transport changes.

This PR ships the headline (chat.agent), the durability primitive underneath it (sessions), the browser transport, agent-side hooks, Agent Skills, an offline test harness, AI SDK tool helpers, an opt-in fast-path for cold-start TTFC (chat.headStart), and an MCP integration so AI assistants drive the same machinery the browser does.

What's in

  • chat.agent({ id, run }) — The headline. Define your agent in one function, pass it to useChat from React, and the conversation persists.
  • sessions primitive — Durable, task-bound, bidirectional channel pair (session.in / session.out) keyed on externalId. One identity, many runs over time. Powers chat.agent and unblocks "approval loop" / "resume tomorrow" workflows generally.
  • chat.headStart — Opt-in fast path: run step 1 in your warm Next.js / Hono / Workers / Express handler while the agent boots in parallel. Cold-start TTFC drops ~50% on the first message; the agent still owns step 2+.
  • Agent Skills — Drop a folder with SKILL.md next to your task, register it with skills.define(), and the agent gets a one-line summary in its prompt and discovers full instructions on demand. The CLI bundles the folder into the deploy image automatically.
  • mockChatAgent — Unit-test agent definitions offline. Drives the real turn loop in-process; no network, no task runtime.
  • ai.toolExecute(task) / ai.tool(task) — Wire a Trigger subtask in as the execute of an AI SDK tool(). Per-tool isolation, retries, and observability, shaped like ordinary AI SDK tools.
  • MCP agent-chat tools — Now run on Sessions, so AI assistants driving an agent get the same idempotent-by-chatId, durable-across-runs behavior the browser does.

chat.agent

// trigger/chat.ts
import { chat } from "@trigger.dev/sdk/ai";
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";

export const myChat = chat.agent({
  id: "my-chat",
  run: async ({ messages, signal }) =>
    streamText({ model: openai("gpt-4o"), messages, abortSignal: signal }),
});
// app/components/chat.tsx
import { useChat } from "@ai-sdk/react";
import { useTriggerChatTransport } from "@trigger.dev/sdk/chat/react";

const transport = useTriggerChatTransport({
  task: "my-chat",
  accessToken: ({ chatId }) => mintChatAccessToken(chatId),
  startSession: ({ chatId, taskId }) => startChatSession({ chatId, taskId }),
});

const { messages, sendMessage, stop, status } = useChat({ transport });

That's the floor. Layer in lifecycle hooks (onPreload, onTurnStart, onTurnComplete, onValidateMessages, onBeforeTurnComplete, onChatStart, onWait) for persistence, validation, and pre-stream work; chat.store for typed shared-data slots both sides read/write; chat.endRun() for clean exit; transport.watch(chatId) for read-only dashboard tabs that observe a run without driving it; chat.requestUpgrade() for end-and-continue handoff to a fresh run on a new version.
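As a hedged sketch of how the hook layer could compose — the hook names come from the list above, but the exact signatures and return shapes are assumptions, and the saveMessages / loadMessages persistence helpers are hypothetical stand-ins for your own DB layer:

```typescript
import { chat } from "@trigger.dev/sdk/ai";
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";
// Hypothetical persistence helpers — stand-ins for your own DB layer.
import { saveMessages, loadMessages } from "@/lib/chat-db";

export const myChat = chat.agent({
  id: "my-chat",
  // Hydrate history from your own store before the first turn streams.
  onChatStart: async ({ chatId }) => ({ messages: await loadMessages(chatId) }),
  // Persist the full transcript once each turn settles.
  onTurnComplete: async ({ chatId, messages }) => {
    await saveMessages(chatId, messages);
  },
  run: async ({ messages, signal }) =>
    streamText({ model: openai("gpt-4o"), messages, abortSignal: signal }),
});
```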

Agents appear under Agents in the dashboard (separate from Tasks) and have their own Playground for testing.

Sessions

The primitive chat.agent is built on. One externalId (your chatId), many runs over time, with a stable .in channel clients write to and .out channel they subscribe to:

import { sessions } from "@trigger.dev/sdk";

const session = await sessions.create({
  externalId: chatId,
  taskIdentifier: "my-task",
});

await session.in.send({ kind: "message", payload: "..." });
for await (const chunk of session.out.read()) {
  /* render */
}

Inside the task, .in.wait() and .waitWithIdleTimeout() suspend the run on a session-stream waitpoint until the next record arrives. .out.append / .pipe / .writer produce records via direct-to-S2 writes. List sessions with sessions.list({ type, tag }) for inbox-style UIs.

A chat you were in yesterday resumes against the same session today, even after the original run idle-timed out or crashed. Pass resume: true on page load and the transport reconnects via sessionId + lastEventId, kicking off a new run only when the user sends.
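On the task side, the suspend loop might look like this sketch — the option name on waitWithIdleTimeout, the `{ ok, value }` result shape, and the process helper are assumptions, not the documented API:

```typescript
// Inside the task's run function, given a `session` bound to this run:
for (;;) {
  // Suspends the run on a session-stream waitpoint until the next record.
  const next = await session.in.waitWithIdleTimeout({ idleTimeout: "10m" });
  if (!next.ok) break; // idle-timed out; a later send starts a fresh run
  // Produce output records via direct-to-S2 writes.
  await session.out.append({ kind: "reply", payload: process(next.value) });
}
```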

chat.headStart

Cold-start tax for an agent's first turn is ~1.3s of boot + hooks before the LLM response can stream. chat.headStart runs step 1 in your warm Next.js / Hono / Workers / Express process while the agent run boots in parallel:

// app/api/chat/route.ts (any Web Fetch handler)
import { chat } from "@trigger.dev/sdk/chat-server";
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";
import { tools } from "@/lib/chat-tools-schemas";

export const POST = chat.headStart({
  agentId: "my-chat",
  run: async ({ chat: chatHelper }) =>
    streamText({
      ...chatHelper.toStreamTextOptions({ tools }),
      model: openai("gpt-4o"),
      system: "...",
    }),
});
// browser: opt in by pointing the transport at your handler
const transport = useTriggerChatTransport({
  task: "my-chat",
  accessToken,
  headStart: "/api/chat",
});

Pure-text first turns finish on the handler side (no LLM call from the trigger run at all). Tool-calling first turns hand ownership to the agent at the tool-call boundary so heavy execute deps stay in the trigger task. Subsequent turns bypass the endpoint entirely. Web Fetch by default; chat.toNodeListener(handler) for Express / Fastify / Koa. Verified locally: ~53% TTFC reduction (1561ms vs 3358ms) with persistence and tool execution behaving identically.
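For Node frameworks, the same handler can be bridged with chat.toNodeListener — a sketch, assuming it adapts the Web Fetch handler to an Express-compatible (req, res) listener:

```typescript
import express from "express";
import { chat } from "@trigger.dev/sdk/chat-server";

const handler = chat.headStart({
  agentId: "my-chat",
  run: async ({ chat: chatHelper }) => {
    /* streamText(...) as in the route handler above */
  },
});

const app = express();
// chat.toNodeListener bridges the Web Fetch handler to (req, res).
app.post("/api/chat", chat.toNodeListener(handler));
app.listen(3000);
```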

Agent Skills

Behavior packaged as a folder, version-controlled, bundled with the deploy image:

import { chat } from "@trigger.dev/sdk/ai";
import { skills } from "@trigger.dev/sdk";

const pdfSkill = skills.define({
  id: "pdf-extract",
  path: "./skills/pdf-extract",
});

export const agent = chat.agent({
  id: "docs-chat",
  onChatStart: async () => {
    chat.skills.set([await pdfSkill.local()]);
  },
  run: async ({ messages, signal }) => streamText({ /* ... */ }),
});

The agent gets a short summary in its system prompt and loads full instructions on demand via the built-in loadSkill tool. bash and readFile tools are scoped per-skill (path-traversal guards, output caps, abort-signal propagation). No trigger.config.ts changes needed; the CLI's indexer picks the folder up automatically. Built on the AI SDK cookbook agent-skills pattern, portable across providers.
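One plausible on-disk layout, assuming (as the text says) the skill folder sits next to the task and SKILL.md carries the summary plus full instructions — file names beyond SKILL.md are illustrative:

```
trigger/
  docs-chat.ts            # the chat.agent definition above
  skills/
    pdf-extract/
      SKILL.md            # one-line summary + full instructions, loaded on demand
      extract-tables.md   # extra reference material the agent can readFile
```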

mockChatAgent

Agent definitions are now unit-testable offline:

import { mockChatAgent } from "@trigger.dev/sdk/ai/test";
import { MockLanguageModelV3 } from "ai/test";

const harness = mockChatAgent(myChat, {
  setupLocals: ({ locals }) => locals.set(dbKey, fakeDb),
});

await harness.send({ text: "hi" });
expect(harness.allChunks).toContainText("hello");
expect(harness.hooks.onTurnComplete).toHaveBeenCalledTimes(1);

Drives the real turn loop in-process — no network, no task runtime. Pairs with MockLanguageModelV3 from ai/test for model mocking. The broader runInMockTaskContext it sits on is exported from @trigger.dev/core/v3/test for unit-testing any task code.

AI SDK tool helpers

import { ai } from "@trigger.dev/sdk/ai";
import { tool } from "ai";
import { z } from "zod";

const myTool = tool({
  description: "Look up a customer by id",
  inputSchema: z.object({ id: z.string() }),
  execute: ai.toolExecute(lookupCustomerSubtask),
});

ai.toolExecute(task) keeps the tool surface yours (description, schema, etc.) and just plugs Trigger's subtask machinery into the body. ai.tool(task) (the old toolFromTask) keeps doing the all-in-one wrap. Min ai peer is ^6.0.116 to avoid cross-version ToolSet mismatches in monorepos.
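For contrast, a sketch of the all-in-one variant, assuming ai.tool infers the tool's description and input schema from the subtask as toolFromTask did:

```typescript
import { ai } from "@trigger.dev/sdk/ai";

// All-in-one wrap: no hand-written tool() surface.
const lookupCustomer = ai.tool(lookupCustomerSubtask);
```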

Browser transport hardening

  • Resilient SSE reconnection — backoff + last-event-id replay so brief network blips don't drop turns.
  • ChatChunkTooLargeError for chunks that exceed the wire limit (with size in the message), so streamText blowups don't swallow the cause.
  • endpoint / headStart opt-in for the transport (above).
  • Multi-tab read-only mode via transport.watch(chatId) for dashboard tabs that observe a run without driving it.

MCP agent-chat integration

The CLI MCP server's start_agent_chat / send_agent_message / close_agent_chat tools now run on Sessions, so AI assistants driving an agent get the same idempotent-by-chatId, durable-across-runs behavior the browser does. Required PAT scopes change from write:inputStreams to read:sessions + write:sessions.

Other fixes

  • Fix dev workers spinning at 100% CPU after the parent CLI disconnects (orphaned worker IPC feedback loop, see dev-worker-disconnect-loop changeset for the gory details).
  • fix(webapp) for the playground "save" action's uncaught JSON.parse (now returns a clean 400 instead of an unhandled 500).
  • typesVersions entry for v3/chat-client + inline CodeQL guards.

Docs

Full guide at /ai-chat — overview, quick-start, frontend, backend, sessions, head start, hooks, persistence, hydration, types, testing, MCP, reference. Sequence diagrams cover first-turn / multi-turn / stop-signal / head-start (pure-text and tool-call paths).

Reference project

references/ai-chat demonstrates everything end-to-end: persistent chat, branching, multi-tab, head-start toggle, hydration mode, upgrade flow.

Versions

  • @trigger.dev/sdk — minor bump (chat.agent, sessions, chat.headStart, ai tool helpers, mockChatAgent, agent skills)
  • @trigger.dev/core — patch
  • @trigger.dev/build — patch (Skills bundling)
  • trigger.dev (CLI) — patch (Skills bundling, MCP Sessions migration, dev-worker disconnect fix)
  • AI SDK peer raised to ai@^6.0.116.

Refs TRI-7532.


changeset-bot Bot commented Mar 4, 2026

🦋 Changeset detected

Latest commit: 02e4a2b

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 30 packages
| Name | Type |
| --- | --- |
| @trigger.dev/sdk | Minor |
| @trigger.dev/core | Minor |
| @trigger.dev/build | Minor |
| trigger.dev | Minor |
| @trigger.dev/python | Minor |
| @internal/sdk-compat-tests | Patch |
| references-ai-chat | Patch |
| d3-chat | Patch |
| references-d3-openai-agents | Patch |
| references-nextjs-realtime | Patch |
| references-realtime-hooks-test | Patch |
| references-realtime-streams | Patch |
| references-telemetry | Patch |
| @trigger.dev/redis-worker | Minor |
| @trigger.dev/schema-to-json | Minor |
| @internal/cache | Patch |
| @internal/clickhouse | Patch |
| @internal/llm-model-catalog | Patch |
| @internal/redis | Patch |
| @internal/replication | Patch |
| @internal/run-engine | Patch |
| @internal/schedule-engine | Patch |
| @internal/testcontainers | Patch |
| @internal/tracing | Patch |
| @internal/tsql | Patch |
| @internal/zod-worker | Patch |
| @trigger.dev/react-hooks | Minor |
| @trigger.dev/rsc | Minor |
| @trigger.dev/database | Minor |
| @trigger.dev/otlp-importer | Minor |


coderabbitai Bot commented Mar 4, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds a browser-safe chat transport and factory (TriggerChatTransport, createChatTransport) and a React hook (useTriggerChatTransport) under @trigger.dev/sdk/chat. Extends the backend AI SDK (@trigger.dev/sdk/ai) with chat primitives (chatTask, pipeChat, createChatAccessToken, CHAT_STREAM_KEY), many chat-related types, and runtime helpers. Implements per-item oversized NDJSON handling (OversizedItemMarker, extractIndexAndTask) and removes BatchItemTooLargeError/related size checks. Adds InputStreamManager methods (setLastSeqNum, shiftBuffer, disconnectStream) and introduces StreamWriteResult and new realtime options (spanName, collapsed). Updates package exports, docs, tests, and package-installation guidance.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~150 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

  • Description check — ⚠️ Warning: The PR description is entirely missing. The author provided no description content, violating the template requirement for testing details, changelog, and confirmation of following contributing guidelines. Resolution: add a detailed PR description including testing steps, a changelog summary, and confirmation that contributing guidelines were followed per the provided template.
  • Docstring Coverage — ⚠️ Warning: Docstring coverage is 68.18%, which is insufficient; the required threshold is 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (1 passed)

  • Title check — ✅ Passed: The PR title clearly summarizes the main change, introducing the AI SDK custom useChat transport and chat.task harness, which aligns with the extensive additions across chat transport, backend task handling, and React integration.



@ericallam ericallam changed the title feature/tri-7532-ai-sdk-chat-transport-and-chat-task-system feat(sdk): AI SDK custom useChat transport & chat.task harness Mar 4, 2026

ericallam added 19 commits May 5, 2026 11:06
… panel + sendAction bridge

UX cleanup discovered during the Sessions e2e sweep. Three changes, one commit because they all live in the chat input row / debug panel area:

- Explicit "Preload" button next to "Send" that only renders when the chat has no messages and no session yet. Clicking calls transport.preload(chatId), which mints the session and triggers the first run with trigger:"preload". Self-hides once session is truthy. Replaces the inert "Preload new chats" sidebar checkbox (the visible `+ New Chat` button only navigated and never called transport.preload — preloadEnabled was wired through the context but read by nobody, since ChatApp.tsx is no longer the mounted chat sidebar). Drops the dead preloadEnabled state + checkbox from chat-settings-context, chat-sidebar, chat-sidebar-wrapper, and the chat-app.tsx legacy code path.

- Debug panel "Runs → View in dashboard" row, gated on dashboardUrl + a new NEXT_PUBLIC_TRIGGER_PROJECT_DASHBOARD_PATH env var. Resolves to the runs-list page filtered by chat:<chatId> tag — so opening the link drops you straight into the run list for the active chat. Threads the new prop through chat-view → chat → DebugPanel.

- window.__chat.sendAction(action) bridge wrapper that delegates to transport.sendAction(chatId, action). Lets smoke tests drive aiChatHydrated's actionSchema (undo/rollback/remove/replace) without reaching into React internals.
CreateSessionRequestBody now requires `taskIdentifier` and `triggerConfig` because Sessions are task-bound (the server reuses the config for every run scheduled by the session — initial + continuations). The MCP `agentChat` tool was still passing only `{ type, externalId }` from the pre-Sessions-as-run-manager API. Add `taskIdentifier: input.agentId` and a minimal `triggerConfig` with `basePayload: { chatId, ...clientData }` and the `chat:{chatId}` auto-tag.

Unblocks typecheck on PR #3173 (and Windows CLI v3 e2e, which builds cli-v3 in pre-test).
Migration 029 added `task_kind` to `task_runs_v2`, and TASK_RUN_COLUMNS was updated, but the four test-data arrays in src/taskRuns.test.ts were not. ClickHouse rejects the inserts with "Cannot parse input: expected ',' before: ']'" because the array length is one short of the column count. All 7 internal/clickhouse unit-test shards on PR #3173 fail on this.

Pre-existing bug (predates my Sessions work) but blocking CI; verified the fix locally — `vitest run src/taskRuns.test.ts` now passes 4/4.
…messages: []` in basePayload

Server-to-agent flows (`AgentChat` SDK class + cli-v3 MCP `start_agent_chat`) were building `triggerConfig.basePayload` without the `trigger: "preload"` and `messages: []` fields the agent runtime branches on. Result: the auto-triggered first run had `payload.trigger === undefined`, neither `onPreload` nor `onChatStart` fired, and `onTurnStart`'s DB-write blew up with PrismaClient "No record found" because no Chat row had been created.

Browser-mediated flows already had this right (`chat.createStartSessionAction` in `ai.ts:6951`); the server-side path now mirrors that shape.

- packages/trigger-sdk/src/v3/chat-client.ts — `AgentChat.ensureStarted` adds the two fields to `basePayload`. `chat-client-test`'s `pong` orchestrator now returns the assistant text instead of an empty string.

- packages/cli-v3/src/mcp/tools/agentChat.ts — same fix on `start_agent_chat`'s `createSession` call. Also drops the redundant separate `apiClient.triggerTask(...)` call: `POST /api/v1/sessions` now auto-triggers the first run and returns its runId, so a second trigger from the MCP would have produced a competing run on the same session. Use `session.runId` from the create response. The `preload` input flag becomes a no-op signal (response message wording only) since session-create always triggers a run now.

Verified end-to-end against local:
- `chat-client-test` orchestrator returns `{ text: "pong" }`
- MCP `start_agent_chat` → `send_agent_message` x2 → `close_agent_chat` succeeds, both turns reuse the same runId
The realtime stream caps each record at ~1 MiB. Today the chat.agent path
through StreamsWriterV2 surfaces a generic S2Error from deep in the
batching layer when a chunk exceeds the cap, with no chunk-type context
and no guidance for callers.

Add a pre-write byte check in StreamsWriterV2.initializeServerStream that
fires before the chunk hits the underlying batcher, and a typed
ChatChunkTooLargeError carrying the chunk's discriminant (type/kind),
serialized size, and cap. Also exports an isChatChunkTooLargeError guard
from the SDK so callers can branch cleanly.

Threshold is 1 MiB minus 1 KiB to leave headroom for the JSON record
envelope. The error message links to the new docs pattern (Pattern:
ID-reference for large tool outputs / out-of-band streams.writer for
run-scoped data).
- typesVersions: add `ai/skills-runtime` mapping (was missing → check-exports
  failed with NoResolution on `@trigger.dev/sdk/ai/skills-runtime`).
- chat.store JSON Patch: reject `__proto__`, `constructor`, `prototype`
  segments at parseJsonPointer. Closes the two CodeQL prototype-pollution
  alerts on chat-client.ts:108 / :120 — a malicious patch like
  `{ op: "replace", path: "/__proto__/x", value: 1 }` would otherwise
  walk into Object.prototype via `parent[lastToken] = value`. Throws a
  clear error on the whole patch instead.
- typesVersions: add `v3/chat-client` mapping. The export was declared in
  `tshy.exports` and the conditional export block but missing from
  `typesVersions` — `attw --pack` flagged "@trigger.dev/core/v3/chat-client"
  as `node10: 💀 Resolution failed`.
- chat.store JSON Patch: add an `assertSafeKey` guard at the assignment
  sites in `removeAt` / `insertAt`. parseJsonPointer already rejects
  `__proto__` / `constructor` / `prototype`, but CodeQL's prototype-pollution
  analysis doesn't trace through the parser boundary — the local check at
  the assignment keeps the static analysis happy and is also a real
  defense-in-depth backstop against any future caller that bypasses
  parseJsonPointer.
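A minimal sketch of the two-layer guard this commit describes — the parser rejection plus the local assignment-site check. The names mirror the commit (parseJsonPointer, assertSafeKey), but the bodies are illustrative, not the SDK's actual code:

```typescript
const UNSAFE_KEYS = new Set(["__proto__", "constructor", "prototype"]);

// Local check that static analysis can verify right at the assignment site.
function assertSafeKey(key: string): string {
  if (UNSAFE_KEYS.has(key)) {
    throw new Error(`Unsafe JSON Pointer segment: ${key}`);
  }
  return key;
}

// RFC 6901 parse ("~1" → "/", "~0" → "~") with the same rejection per segment.
function parseJsonPointer(pointer: string): string[] {
  return pointer
    .split("/")
    .slice(1)
    .map((seg) => assertSafeKey(seg.replace(/~1/g, "/").replace(/~0/g, "~")));
}
```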
…SessionTriggerConfig + sync playground transport clientData

Two fixes from Devin's review on PR #3173.

## SessionTriggerConfig is missing 3 fields the playground UI shows

The playground sidebar (`PlaygroundSidebar`) renders working controls for
`maxDuration`, `version`, and `region`. The action received the form fields,
but `SessionTriggerConfig` didn't accept them so they were `void`-suppressed
and silently dropped. Runs ignored the user's max-duration cap, the version
pin didn't apply, and region selection had no effect.

- `packages/core/src/v3/schemas/api.ts` — add three optional fields to
  `SessionTriggerConfig`: `maxDuration` (positive int, seconds),
  `lockToVersion` (string), `region` (string). All three forward to the
  matching field on `TaskRunOptions`.
- `apps/webapp/app/services/realtime/sessionRunManager.server.ts` — extend
  `triggerSessionRun`'s `body.options` to thread the three fields through
  to `TriggerTaskService` when present.
- `apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.playground.action.tsx`
  — fold the three form fields into `triggerConfig`; remove the `void`
  suppressions.

## Playground transport's clientData becomes stale after edits

The route constructs `TriggerChatTransport` directly via `useRef` (to avoid
the React-version mismatch the hook had). The hook normally calls
`setClientData` whenever `clientData` changes, but this manual construction
bypassed that — so `clientData` was captured at construction and never
updated. Per-turn `metadata` merges (`this.defaultMetadata` in
`packages/trigger-sdk/src/v3/chat.ts`) used the stale initial value for
the whole conversation. `startSession` was already reading from the live
ref so session creation was unaffected; this only fixed the per-turn path.

- `apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.playground.$agentParam/route.tsx`
  — add a `useEffect` that calls `transport.setClientData(...)` whenever
  `clientDataJson` changes.

Changeset (patch, @trigger.dev/core) for the schema additions; server-
changes file for the webapp-only behaviour fix.
Roll up all the chat.agent feature work that's been accumulating on this
branch into 8 user-facing CHANGELOG entries. No behavior change — just
tidying up the .changeset/ directory before merge.

Final shape:

- chat-agent.md (sdk minor + core patch) — the headline; folds 13:
  ai-sdk-chat-transport, ai-chat-sandbox-and-ctx, chat-agent-*,
  chat-customagent-session-binding-and-stop-fixes,
  chat-reconnect-isstreaming-optional, chat-run-pat-renewal,
  chat-store-primitive, chat-transport-session-renew-plus-preload,
  drop-legacy-chat-stream-constants, dry-sloths-divide,
  trigger-chat-transport-watch-mode.
- sessions-primitive.md (core + sdk patch) — folds 3: session-primitive,
  session-sdk-toolkit, session-trigger-config-extra-fields.
- agent-skills.md (sdk + core + build + cli patch) — folds 2:
  chat-agent-skills-phase-1, skills-runtime-subpath.
- ai-tool-helpers.md (sdk patch) — folds 2: ai-tool-execute-helper,
  ai-tool-toolset-typing.
- mock-chat-agent-test-harness.md (sdk + core patch) — folds 3:
  mock-chat-agent-test-harness, mock-task-context-test-infra,
  mock-chat-agent-setup-locals.
- mcp-agent-chat-sessions.md (cli patch) — kept standalone.
- add-is-replay-context.md (core patch) — kept standalone (general task feature).
- truncate-error-stacks.md (core patch) — kept standalone (general infra).

Bumps preserved (chat-agent stays minor on sdk; everything else patch).
Auto-named "dry-sloths-divide" got merged into chat-agent and dropped.
The previous pass rolled 26 changesets into 8 but the consolidated
descriptions read like docs (full API surface dumps, multiple sections,
docs-style headers). Rewrote each so they fit a release-notes bullet
list — short, what-shipped framing, with one or two snippets where they
help, no exhaustive type / option enumeration.
- inline prototype-pollution guards at JSON Patch assignment sites in chat-client.ts so CodeQL can statically verify them (Set.has() check upstream wasn't being traced)
- wrap JSON.parse(payloadStr) in playground action's start handler to return 400 on malformed JSON instead of 500
Replace the legacy 5-attempt retry cap on SSEStreamSubscription with
indefinite retry on a bounded jittered backoff. Adds a force-reconnect
path so the chat transport can recover from silent-dead-socket cases
on mobile (background-kill, bfcache restore) without waiting for the
next backoff slot.

SSEStreamSubscription:
  - maxRetries default Infinity (was 5), retryDelayMs 100ms (was 1s),
    new maxRetryDelayMs cap (5s), retryJitter 50%
  - retryNow(): wake an in-flight backoff
  - forceReconnect(): drop current connection AND wake backoff
  - fetchTimeoutMs (30s default): aborts stuck connect attempts that
    block forever on dead sockets
  - stallTimeoutMs (opt-in): force reconnect on silent reader
  - nonRetryableStatuses (default [404, 410]): short-circuit retry
    for stream-gone / session-closed
  - Fixed listener leak where each retry accumulated an abort listener
    on the user signal because finally only ran once the recursion
    unwound. Cleanup now runs per-attempt via cleanupAttempt() in both
    the catch (before recursion) and finally paths.

TriggerChatTransport (browser):
  - online        -> forceReconnect (existing socket may be stale)
  - pageshow.persisted -> forceReconnect (Safari bfcache restore)
  - visibilitychange -> visible only:
      * hidden >= 30s -> forceReconnect
      * hidden < 30s  -> retryNow (cheap wake)
  - stallTimeoutMs: 60s (sized over typical agent thinking pauses)

Tests: 13 vitest cases covering retry-past-legacy-cap, backoff cap,
jitter variance, retryNow short-circuit, abort-during-backoff,
forceReconnect during fetch and during read (verifies Last-Event-ID
resume on the resumed request), fetchTimeout, stallTimeout, 404/410
short-circuit, custom nonRetryableStatuses, 503 still retries.

Refs TRI-8903.
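The schedule above can be written down as a small function — a hedged reconstruction from the numbers in this commit (100ms base, 5s cap, 50% jitter); the function name and the exact jitter direction are assumptions:

```typescript
// Bounded, jittered exponential backoff: the slot doubles from retryDelayMs,
// is capped at maxRetryDelayMs, and up to retryJitter of it is randomized off.
function backoffDelay(
  attempt: number,
  retryDelayMs = 100,
  maxRetryDelayMs = 5_000,
  retryJitter = 0.5,
  random: () => number = Math.random
): number {
  const slot = Math.min(retryDelayMs * 2 ** attempt, maxRetryDelayMs);
  return slot - slot * retryJitter * random();
}
```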
Adds an opt-in fast path that runs step 1 streamText in the warm
customer process (Next.js, Hono, Workers, Express, etc.) while the
trigger agent run boots in parallel. Pure-text turns finish on the
handler side; tool-call turns hand ownership to the agent at the
tool-call boundary via a `kind: "handover"` chunk on session.in.

- New @trigger.dev/sdk/chat-server subpath with chat.headStart,
  chat.openSession (escape hatch), and chat.toNodeListener (Express /
  Fastify / Koa bridge from Web Fetch handler to (req, res)).
- Wire-format: ChatInputChunk gains kind: "handover" with isFinal flag
  and partialAssistantMessage; trigger payload kind: "handover-prepare"
  for the boot-and-wait variant.
- Run-loop: handover-prepare branch waits on session.in, then either
  skips userRun (isFinal: true → pure-text) or seeds accumulators and
  resumes step 2+ from tool-output-available (isFinal: false).
- Browser: TriggerChatTransport gains an optional `headStart` URL.
  First-turn POSTs go there; turn 2+ bypasses and writes session.in.
- Tests: chat-server.test.ts (handover dispatch, isFinal routing) and
  chatHandover.test.ts (run-loop branching, hook ordering, idle-timeout
  exit, schema-only-on-handler / executes-on-agent tool round).
Adds a /api/chat route handler exporting chat.headStart, splits the
tool definitions across two modules so heavy executes never reach the
browser bundle, and exposes a sidebar toggle for paired TTFC tests.

- src/lib/chat-tools-schemas.ts (new): schema-only tool definitions —
  imported by both the route handler and the agent task. No `execute`,
  no heavy deps. Bundle stays small.
- src/trigger/chat-tools.ts (renamed): re-exports the schemas with
  agent-side `execute` fns added (E2B sandbox, turndown, deepResearch
  subtask, etc.). Only the trigger task imports this.
- src/app/api/chat/route.ts (new): exports POST = chat.headStart, runs
  step 1 streamText with claude-sonnet-4-6 to match the agent's default.
- ChatSettingsContext + sidebar gain a "Use handover (1st turn)"
  toggle; chat-view threads it into the transport's `headStart` URL.
- Smoke result: ~53% TTFC reduction on first turn (1561ms vs 3358ms),
  with persistence + tool execution behaving identically.
…rmed input

Mirrors the 'start' case (lines 104-108) — uncaught JSON.parse on a
malformed messages form field surfaced as an unhandled 500 instead of
a clean 400. Addresses Devin review on PR #3173.
…e variants

Mirror the stamping + read-precedence work from TRI-9073 in the
ai-chat-feature-branch-only routes:

- Playground action: stamp `streamBasinName` from
  `environment.organization.streamBasinName` on Session.upsert.
- Playground SSE / append routes: pass `{ session }` to
  `getRealtimeStreamInstance` so basin resolves via session row.
- Dashboard run-stream / run-input / run-session SSE routes (the
  dashboard-auth counterparts to the public /realtime/v1/* routes):
  same plumbing.

These files only exist on this branch (they were added by the
chat-agent / Sessions PRs), so the plumbing rides along here rather
than on the basin migration branch.

Refs TRI-9073.
…idle test

Two unrelated fixes that both block the ai-chat feature branch.

apps/webapp queues concern — locked + specified-queue branch was
silently dropping `taskKind`. The TTL-skip optimization on the
backgroundWorkerTask lookup also skipped the only place we read
`triggerSource`, so AGENT and SCHEDULED runs triggered with both
`lockToVersion` and a queue override were annotated as STANDARD and
disappeared from the run-list "Source" filter (and replicated to
ClickHouse with `task_kind = 'STANDARD'`). The lookup now always
runs and includes `triggerSource` in the same select; ttl is still
gated on the override being absent. Mirrors the sibling locked-with-
default-queue branch (line ~162) and the non-locked branch's
`getTaskQueueInfo`.

trigger-sdk test harness — `mockChatAgent` was leaving an
`ApiClientMissingError` unhandled-rejection trail when an agent's
suspend path tripped (the `chat.handover` idle-timeout test reliably
hit it). The harness reused the real `SessionInputChannel`, whose
`wait()` calls `apiClientManager.clientOrThrow()` — fine in
production, fatal in a test process with no `TRIGGER_SECRET_KEY`.
Added a `TestSessionInputChannel` subclass that overrides only
`wait()` and resolves `{ok:false}` when the harness's run signal
aborts; `on`/`once`/`peek`/`send` continue to flow through the real
`sessionStreams` global. The harness threads its `runSignal.signal`
in via a lazy getter so the channel reads it after the controller is
constructed.

All 97 sdk tests pass; webapp typecheck is clean.
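A self-contained sketch of the harness pattern described above. Class and method names are reduced to the essentials; the real `SessionInputChannel`, its `on`/`once`/`peek`/`send` methods, and the `sessionStreams` global are omitted, and the base-class body here is a stand-in:

```typescript
type WaitResult = { ok: true; data: unknown } | { ok: false };

class SessionInputChannel {
  // Stand-in for the production path: wait() reaches for a real API
  // client, which throws in a test process with no TRIGGER_SECRET_KEY.
  async wait(): Promise<WaitResult> {
    throw new Error("ApiClientMissingError: no TRIGGER_SECRET_KEY");
  }
}

class TestSessionInputChannel extends SessionInputChannel {
  // The signal is threaded in via a lazy getter so the channel reads it
  // after the harness's AbortController is constructed.
  constructor(private getSignal: () => AbortSignal) {
    super();
  }

  // Override only wait(): resolve { ok: false } when the harness's run
  // signal aborts, instead of calling into the API client.
  override async wait(): Promise<WaitResult> {
    const signal = this.getSignal();
    if (signal.aborted) return { ok: false };
    return new Promise<WaitResult>((resolve) => {
      signal.addEventListener("abort", () => resolve({ ok: false }), { once: true });
    });
  }
}
```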
…und action

ApiRunListPresenter was returning `run.taskKind` raw from ClickHouse
where the column defaults to `""` for pre-migration rows, while the
dashboard's NextRunListPresenter normalizes to `"STANDARD"`. API
consumers and the dashboard now agree.

Playground action's two `as unknown as AuthenticatedEnvironment`
casts were redundant — `findEnvironmentBySlug` already returns
`Promise<AuthenticatedEnvironment | null>`. Drop the casts (and the
now-unused import) so a future change to the function's return shape
actually surfaces as a type error instead of crashing at runtime.
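The normalization the two presenters now share can be sketched as (helper name hypothetical):

```typescript
// ClickHouse defaults task_kind to "" for pre-migration rows; both the
// API presenter and the dashboard's NextRunListPresenter should agree
// on "STANDARD" for those rows.
function normalizeTaskKind(raw: string | null | undefined): string {
  return raw ? raw : "STANDARD";
}
```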
`StandardSessionStreamManager#ensureTailConnected` re-subscribes the SSE
tail in `.finally` whenever handlers or once-waiters remain on the key.
That's the right move for unexpected tail crashes, but wrong when
`session.in.wait()` calls `disconnectStream` to suspend the run via a
waitpoint. The run-level `stopInput.on(...)` registered at the top of the
`chat.agent` loop keeps the handlers set non-empty, so during the suspend
window the tail silently resurrects. When the next user message arrives at
S2, the tail dispatches it, `stopSub`'s "kind === stop" filter rejects it,
and the data falls into the buffer, while the waitpoint *also* delivers
the same record and the SDK resumes. On the next turn's
`messagesInput.on(...)` registration the buffer drain re-fires the
handler: `pendingMessages` ends up holding a duplicate of the
just-consumed message and the loop runs an extra LLM turn with identical
content.

Track explicit teardown via `explicitlyDisconnected: Set<string>`.
`disconnectStream` adds the key, `.finally` bails when set, `on()` /
`once()` clear it so future re-attaches reconnect normally. Honors
`wait()`'s expectation that explicit disconnect ⇒ no records buffered
or delivered until a fresh `on()`/`once()`, while preserving
auto-reconnect for legitimate failures (network drops, etc.).

Verified end-to-end against a `chat.agent` reproduction that
previously fired three turns per submitted message after suspend; with
the fix exactly one turn per message, single LLM call, single
persisted assistant reply.

Trivial: `wait()` extracts `nextSeq` to a local for readability.
ericallam added 2 commits May 5, 2026 18:14
`SSEStreamSubscription.connectStream` invokes `onError` twice for 401/403
responses: first in the `!response.ok` branch where the auth ApiError is
constructed (so consumers see the original failure status/headers), then
again in the catch block's `isTriggerRealtimeAuthError` arm before
terminating the stream. Drop the second call — the early one already
notified the consumer; the catch block's job is just to route the error
to `controller.error` so retry doesn't fire.

Spotted by Devin on PR 3173.
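The single-notification invariant can be sketched as follows (a simplified stand-in for `SSEStreamSubscription.connectStream`; parameter shapes are hypothetical):

```typescript
// The !response.ok branch notifies the consumer with the original auth
// failure; the catch arm's only job is to route the error to
// controller.error (here `fail`) so retry doesn't fire. It must NOT
// call onError a second time.
function connect(
  status: number,
  onError: (e: Error) => void,
  fail: (e: Error) => void // stands in for controller.error
) {
  try {
    if (status === 401 || status === 403) {
      const err = new Error(`auth failed: ${status}`);
      onError(err); // consumer sees the original failure status
      throw err;
    }
  } catch (err) {
    // Previously this arm also called onError; now it only terminates
    // the stream.
    fail(err as Error);
  }
}
```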
Adds `taskContext.setConversationId()` to the core API so the chat.task
and chat.agent run boots can flag a chat run with the OTel GenAI
`gen_ai.conversation.id` semantic attribute. The TaskContextSpanProcessor
stamps it on every span at start and TaskContextMetricExporter copies it
into every metric data point — `ctx.*` is filtered by the OTLP ingest,
but `gen_ai.*` survives to the stored attributes column without a schema
migration. Lets dashboard span/metric views correlate by chat conversation
across multiple runs.

Closes TRI-9082.
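A minimal sketch of the stamping behavior, with no OTel dependency and the interfaces reduced to what the commit describes (only the attribute key `gen_ai.conversation.id` is from the source; everything else is simplified):

```typescript
type Span = { attributes: Record<string, string> };

class TaskContextLike {
  private conversationId?: string;

  setConversationId(id: string) {
    this.conversationId = id;
  }

  // Mirrors what TaskContextSpanProcessor does at span start: stamp the
  // OTel GenAI semantic attribute so it survives the OTLP ingest filter
  // (ctx.* is dropped; gen_ai.* reaches the stored attributes column).
  onSpanStart(span: Span) {
    if (this.conversationId) {
      span.attributes["gen_ai.conversation.id"] = this.conversationId;
    }
  }
}
```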
@devin-ai-integration (bot, Contributor) left a comment

Devin Review found 2 new potential issues.

View 24 additional findings in Devin Review.
Comment thread: `packages/trigger-sdk/src/v3/chat-server.ts`
ericallam added 3 commits May 6, 2026 10:03
Action turns previously fell through to the regular turn machinery,
calling onTurnStart, run(), onTurnComplete, etc. — meaning every action
fired an LLM call by default. Customers worked around this with a
chat.store-based skipModelCall flag pattern (Graham at Arena).

Now actions fire hydrateMessages and onAction only. No onTurnStart,
prepareMessages, onBeforeTurnComplete, onTurnComplete; no run()
invocation; no turn-counter increment. The trace span is named
"chat action" instead of "chat turn N".

onAction widens to accept the same return shapes as run(): void
(side-effect-only, default), StreamTextResult (auto-piped as the
response), string, or UIMessage. Customers who want a model response
from an action return streamText(...) directly from onAction.

If an action arrives but no onAction handler is configured, console.warn
fires once and the action is ignored (vs. silently triggering run()
on a stale wire payload).

Closes TRI-9118.

BREAKING: customers who relied on actions auto-invoking run() must
move that logic into onAction. See the changeset for the migration
snippet.
Wire a typed `undo` action on the main `aiChat` agent in
references/ai-chat: `actionSchema` accepts `{ type: "undo" }` and
`onAction` calls `chat.history.slice(0, -2)` to drop the last
user/assistant exchange. Adds an Undo button to the chat input row
that calls `transport.sendAction(chatId, { type: "undo" })` and
optimistically updates local `useChat` state via `setMessages`.

Exercises the new TRI-9118 action semantics end-to-end through the
demo UI: clicking Undo emits a `chat action` span (not `chat turn`),
fires only `onAction()`, no `run()` / `streamText` / turn lifecycle
hooks. The next message turn sees the truncated server-side history.
Update the sendAction docstrings on TriggerChatTransport (browser) and
AgentChat (server) to reflect TRI-9118: actions fire hydrateMessages
and onAction only — no run(), no turn lifecycle hooks. The returned
stream is empty for void onAction returns and carries the model
response when onAction returns a StreamTextResult.
ericallam added 3 commits May 7, 2026 09:46
- Sessions list mirrors the Runs list (ClickHouse-backed, filterable, cursor-paginated, derived ACTIVE/CLOSED/EXPIRED status).
- Session detail page: split-pane Conversation + Inspector with Overview/Runs/Metadata tabs, breadcrumb status combo, Close session action via a Remix resource route, dashboard-cookie-authed SSE for input/output streams.
- AgentView decoupled from a specific run — now subscribes via session-scoped SSE, so the same component renders on both run and session pages with identical streaming behavior.
- Run inspector adds a Session row (gated on AGENT-tagged runs) linking back to the owning session, mirroring the existing Batch row pattern.
- stress-emit chat.agent task added to the ai-chat reference for stress-testing the conversation UI.
- playground transport: route startSession through a ref so sidebar
  edits (tags, machine, maxAttempts, maxDuration, version, region) made
  before the first send aren't silently ignored. Mirrors the existing
  clientDataJsonRef pattern.
- chat.handover: swallow the rethrown error after the recovery path
  has already run so processes started with --unhandled-rejections=throw
  don't crash. User-facing behavior unchanged.
…sume

When the AI SDK regenerates the assistant message id on an
addToolOutput-driven HITL continuation, our id-merge in hydrateMessages
fails to attach the tool answer to the existing head — duplicating the
assistant in the accumulator. Reported by Arena AI, who maintains a
content-match workaround in their hydrateMessages keyed on tool_call_id.

Add a run-scoped `toolCallId -> head messageId` map populated whenever
an assistant message containing tool parts lands in the accumulator.
The submit-message id-merge now falls back to this map when id-match
fails: walk the incoming message's tool parts, look up by toolCallId,
rewrite the incoming id back to the recorded head, then retry id-match.

Could not reproduce on current AI SDK 6.0.116 (the id is preserved
through addToolOutput), but the mapping ships anyway so the merge stays
robust against older versions and edge cases we haven't observed.
Customer-side content-match workarounds become unnecessary.

Closes TRI-9137.
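The fallback merge can be sketched self-contained (message shapes reduced to ids and tool-call ids; the accumulator and map names are simplified from the commit):

```typescript
type Msg = { id: string; toolCallIds: string[] };

class Accumulator {
  private byId = new Map<string, Msg>();
  // Run-scoped toolCallId -> head messageId map, populated whenever an
  // assistant message containing tool parts lands in the accumulator.
  private toolCallToHead = new Map<string, string>();

  add(msg: Msg) {
    this.byId.set(msg.id, msg);
    for (const tc of msg.toolCallIds) this.toolCallToHead.set(tc, msg.id);
  }

  // Id-merge with toolCallId fallback: when the incoming id doesn't
  // match (e.g. the AI SDK regenerated it on an addToolOutput-driven
  // continuation), look up by toolCallId, rewrite the incoming id back
  // to the recorded head, then retry the id match.
  merge(incoming: Msg): Msg | undefined {
    let head = this.byId.get(incoming.id);
    if (!head) {
      for (const tc of incoming.toolCallIds) {
        const headId = this.toolCallToHead.get(tc);
        if (headId) {
          incoming.id = headId;
          head = this.byId.get(headId);
          break;
        }
      }
    }
    return head;
  }
}
```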