Add assistant turn stats footer #2601

Open
coygeek wants to merge 4 commits into pingdotgg:main from coygeek:fix/session-stats-footer-metrics
Conversation

@coygeek coygeek commented May 8, 2026

Summary

Closes #2518.

  • add a compact assistant turn stats footer with model/effort, elapsed time, output tokens, throughput, TTFT, and tool-call count when available
  • project token usage onto the latest completed turn so the footer survives reloads and avoids stale/unassigned context-window snapshots
  • derive Codex TTFT from first assistant output and completed throughput duration from active assistant response boundaries, excluding approval/user-input waits and preserving provider-supplied durations
  • keep unavailable metrics hidden instead of fabricating zero values
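
A minimal sketch of the "hide unavailable metrics" rule described above, not the PR's actual TurnStatsFooter code. The `TurnStats` shape, the function names, and the rounding rule (one decimal below 10 seconds, whole seconds otherwise) are assumptions made for illustration.

```typescript
// Hypothetical shape: the field names mirror the metrics listed in the summary
// but are illustrative, not the PR's actual types.
interface TurnStats {
  modelLabel?: string;
  elapsedMs?: number;
  outputTokens?: number;
  tokensPerSecond?: number;
  timeToFirstTokenMs?: number;
  toolCallCount?: number;
}

// Assumed display rule: one decimal below 10 s, whole seconds otherwise.
function formatSeconds(ms: number): string {
  const seconds = ms / 1000;
  return seconds < 10 ? `${seconds.toFixed(1)} sec` : `${Math.round(seconds)} sec`;
}

// Builds the footer text from whichever metrics are present.
// A missing metric is skipped entirely; it is never rendered as 0.
function formatTurnStatsFooter(stats: TurnStats): string {
  const parts: string[] = [];
  if (stats.modelLabel !== undefined) parts.push(stats.modelLabel);
  if (stats.elapsedMs !== undefined) parts.push(formatSeconds(stats.elapsedMs));
  if (stats.outputTokens !== undefined) parts.push(`${stats.outputTokens} tokens`);
  if (stats.tokensPerSecond !== undefined) {
    parts.push(`${stats.tokensPerSecond.toFixed(1)} tok/sec`);
  }
  if (stats.timeToFirstTokenMs !== undefined) {
    parts.push(`Time-to-first: ${formatSeconds(stats.timeToFirstTokenMs)}`);
  }
  if (stats.toolCallCount !== undefined && stats.toolCallCount > 0) {
    const n = stats.toolCallCount;
    parts.push(`${n} tool call${n === 1 ? "" : "s"}`);
  }
  return parts.join(" · ");
}
```

With only output tokens, throughput, and TTFT available, this sketch produces a three-part footer and omits the model, elapsed time, and tool-call segments rather than showing placeholders.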

Verification

  • cd apps/server && bun run test -- src/orchestration/Layers/ProviderRuntimeIngestion.test.ts
  • cd apps/server && bun run test -- src/orchestration/Layers/ProviderRuntimeIngestion.test.ts src/orchestration/Layers/ProjectionPipeline.test.ts
  • cd apps/web && bun run test -- src/lib/turnStats.test.ts src/lib/contextWindow.test.ts
  • bun fmt
  • bun lint (passes with existing warnings only)
  • bun typecheck
  • bun run test
  • git diff --check

Live smoke

  • Reused the local app on http://localhost:5733/
  • Sent a short Codex turn via agent-browser
  • Confirmed one [data-assistant-turn-stats="true"] footer rendered with 16 tokens, 66.7 tok/sec, and Time-to-first: 7.2 sec
  • Confirmed SQLite persisted the same latest turn with durationMs: 240, timeToFirstTokenMs: 7216, and lastOutputTokens: 16, so completed TPS is not using the old sub-10ms final-delta denominator
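
The smoke-test numbers can be checked by hand: 16 tokens over 240 ms is 16 / 0.240 ≈ 66.7 tok/sec. The sketch below reproduces that arithmetic and shows why a sub-10ms denominator inflates the figure. The function name is hypothetical, and the 8 ms value is an assumed example of a "final delta" gap, not a number from the PR.

```typescript
// Hypothetical helper: returns undefined instead of a fabricated value
// whenever an input is missing or the duration is not positive.
function tokensPerSecond(
  outputTokens: number | undefined,
  durationMs: number | undefined,
): number | undefined {
  if (outputTokens === undefined || durationMs === undefined) return undefined;
  if (durationMs <= 0) return undefined;
  return outputTokens / (durationMs / 1000);
}

// Values persisted in the smoke test: lastOutputTokens 16, durationMs 240.
const persisted = tokensPerSecond(16, 240);

// Assumed 8 ms gap between the last two deltas, used as the denominator.
const inflated = tokensPerSecond(16, 8);

console.log(persisted?.toFixed(1)); // "66.7"
console.log(inflated === undefined ? undefined : Math.round(inflated)); // 2000
```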

Note

Medium Risk
Touches provider runtime ingestion, event schema, and thread projection state, which could affect persisted turn pointers and token-usage activity data if the derived timing logic or new turnId plumbing is incorrect. UI changes are low risk but depend on the new backend-projected metrics.

Overview
Adds an assistant turn stats footer to assistant messages in the chat timeline, showing model/effort, elapsed time, output tokens, throughput, TTFT, and tool-call count only when those metrics are available.

On the backend, token-usage activities now support timeToFirstTokenMs and Codex token-usage events carry a turnId/providerTurnId. When provider durations are missing, ingestion derives assistant-only generation duration and TTFT from Codex assistant stream boundaries while excluding tool/approval/user-input gaps, and persists these values so the footer survives reloads.
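
As a rough illustration of the derivation described above, the sketch below sums only the spans during which the assistant was streaming, so gaps between segments (tool runs, approvals, user input) never enter the duration. The `ResponseSegment` type, the function name, and the choice to measure TTFT from an assumed turn-start timestamp are hypothetical; the actual logic in ProviderRuntimeIngestion.ts may differ.

```typescript
// Hypothetical representation of one uninterrupted span of assistant output.
interface ResponseSegment {
  startedAtMs: number;
  endedAtMs: number;
}

interface DerivedTiming {
  durationMs?: number;
  timeToFirstTokenMs?: number;
}

// Sums segment lengths for the generation duration and measures TTFT from the
// (assumed) turn start to the first assistant output.
function deriveAssistantTiming(
  turnStartedAtMs: number,
  segments: ResponseSegment[],
): DerivedTiming {
  if (segments.length === 0) return {}; // nothing observed, so nothing derived
  const durationMs = segments.reduce(
    (total, s) => total + Math.max(0, s.endedAtMs - s.startedAtMs),
    0,
  );
  const firstOutputAtMs = Math.min(...segments.map((s) => s.startedAtMs));
  return {
    durationMs,
    timeToFirstTokenMs: Math.max(0, firstOutputAtMs - turnStartedAtMs),
  };
}

// Two 500 ms bursts of assistant output separated by a long gap.
const timing = deriveAssistantTiming(0, [
  { startedAtMs: 1_000, endedAtMs: 1_500 },
  { startedAtMs: 60_000, endedAtMs: 60_500 },
]);
console.log(timing); // { durationMs: 1000, timeToFirstTokenMs: 1000 }
```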

Also fixes thread projection updates to preserve latestTurnId when a thread.session-set clears activeTurnId, and updates proposed-plan exports to prefix generated filenames with plan- (without double-prefixing).
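
The two smaller fixes above can be sketched as follows. This is an illustration under assumed names (`ThreadProjection`, `applySessionSet`, `withPlanPrefix`), not the PR's actual projection or export code.

```typescript
// Hypothetical slice of the thread projection state.
interface ThreadProjection {
  activeTurnId: string | null;
  latestTurnId: string | null;
}

// A session-set event may clear activeTurnId, but the pointer to the latest
// turn is kept so per-turn stats can still be located afterwards.
function applySessionSet(
  thread: ThreadProjection,
  incomingActiveTurnId: string | null,
): ThreadProjection {
  return {
    activeTurnId: incomingActiveTurnId,
    latestTurnId: incomingActiveTurnId ?? thread.latestTurnId,
  };
}

// Adds the "plan-" prefix once; already-prefixed names are returned unchanged.
function withPlanPrefix(filename: string): string {
  return filename.startsWith("plan-") ? filename : `plan-${filename}`;
}

const cleared = applySessionSet({ activeTurnId: "turn-7", latestTurnId: "turn-7" }, null);
console.log(cleared); // { activeTurnId: null, latestTurnId: "turn-7" }
console.log(withPlanPrefix("refactor.md")); // "plan-refactor.md"
console.log(withPlanPrefix("plan-refactor.md")); // "plan-refactor.md"
```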

Reviewed by Cursor Bugbot for commit 89169c2. Bugbot is set up for automated code reviews on this repo.

Note

Add assistant turn stats footer to chat messages with duration, throughput, and tool call metrics

  • Introduces a TurnStatsFooter component that renders a compact stats row (model, elapsed time, tokens, throughput, TTFT, tool calls) beneath completed assistant messages in the chat timeline.
  • Adds turnStats.ts with utilities to derive, format, and assemble these metrics from thread activities and context-window snapshots, including fallback logic for unassigned snapshots and tool call counting.
  • Extends the ingestion layer in ProviderRuntimeIngestion.ts to track assistant response segment timing per turn, deriving durationMs and timeToFirstTokenMs and merging them into context-window.updated activities when provider data is absent.
  • Adds timeToFirstTokenMs to ThreadTokenUsageSnapshot and propagates turnId/providerTurnId through thread.token-usage.updated events from the Codex adapter.
  • Fixes thread.session-set projection to retain latestTurnId when the incoming event has a null activeTurnId.

Macroscope summarized 89169c2.

Show compact assistant turn stats with model, elapsed time, token count, throughput, TTFT, and tool-call count when those fields are available. Derive Codex timing from response boundaries so completed throughput does not use tiny final delta gaps as the denominator.

Closes pingdotgg#2518

coderabbitai Bot commented May 8, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


github-actions Bot added labels on May 8, 2026: size:XL (500-999 changed lines, additions + deletions) and vouch:unvouched (PR author is not yet trusted in the VOUCHED list).
Comment thread apps/web/src/lib/turnStats.ts Outdated
Contributor

macroscopeapp Bot commented May 8, 2026

Approvability

Verdict: Needs human review

This PR introduces a new user-facing feature (assistant turn stats footer) with new UI components, server-side timing state management, and non-trivial derivation logic. New features warrant human review, and there's also an unresolved comment about potential TTFT overwriting in the fallback path.

Comment thread apps/web/src/lib/turnStats.ts
Member

juliusmarminge commented May 8, 2026

[Screenshot: CleanShot 2026-05-08 at 09 51 45@2x]

Author

coygeek commented May 8, 2026

[Image]

Member

@juliusmarminge juliusmarminge left a comment

you aren't counting TPS here, you're taking the full turn duration / turn token count, which doesn't take tool calls into account, so the metric is stupid? forgive me if i misunderstand

Close Codex assistant response timing segments when tool work starts so fallback generation duration does not include command or file-change output time. Add regression coverage for a long command gap and clarify the TPS tooltip.

Refs: pingdotgg#2518
Author

coygeek commented May 9, 2026

Good catch. The UI was not directly dividing by the whole turn elapsed time, but the Codex fallback timing could still include tool execution wall time in the assistant generation duration.

I pushed 89169c22 to address that. The backend now closes the active assistant response timing segment as soon as tool work starts (item.started / item.updated tool lifecycle events, plus command/file-change output streams), so derived TPS no longer treats command execution time as model-writing time.

I also added a regression test for the specific failure mode: assistant text starts, a long command runs, then token usage is emitted. Before the fix, that case reported 102000ms of assistant generation duration; after the fix it reports 1000ms. The TPS tooltip now clarifies that derived timings exclude tool time.
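
The failure mode can be reproduced with a small event-driven timer. This is a sketch under assumptions: the class and method names are hypothetical, and the timestamps (assistant output at 0 ms, tool start at 1000 ms, token usage at 102000 ms) were chosen only to match the 102000ms and 1000ms figures quoted above.

```typescript
// Hypothetical tracker: accumulates time only while an assistant response
// segment is open.
class AssistantSegmentTimer {
  private openSinceMs: number | undefined;
  private totalMs = 0;

  // The first assistant output opens a segment; later output keeps it open.
  onAssistantOutput(atMs: number): void {
    if (this.openSinceMs === undefined) this.openSinceMs = atMs;
  }

  // Tool work closes the open segment so tool wall time is not counted.
  onToolStarted(atMs: number): void {
    this.close(atMs);
  }

  // Token usage ends the turn's accounting and returns the generation duration.
  onTokenUsage(atMs: number): number {
    this.close(atMs);
    return this.totalMs;
  }

  private close(atMs: number): void {
    if (this.openSinceMs === undefined) return;
    this.totalMs += Math.max(0, atMs - this.openSinceMs);
    this.openSinceMs = undefined;
  }
}

// Before the fix: nothing closes the segment when the command starts.
const before = new AssistantSegmentTimer();
before.onAssistantOutput(0);
console.log(before.onTokenUsage(102_000)); // 102000

// After the fix: the tool start at 1000 ms closes the segment.
const after = new AssistantSegmentTimer();
after.onAssistantOutput(0);
after.onToolStarted(1_000);
console.log(after.onTokenUsage(102_000)); // 1000
```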

Verified locally:

  • bun fmt
  • cd apps/server && bun run test src/orchestration/Layers/ProviderRuntimeIngestion.test.ts -t "excludes command execution wall time"
  • cd apps/web && bun run test src/lib/turnStats.test.ts
  • bun lint (13 existing warnings, 0 errors)
  • bun typecheck
  • local tmux dev build + Computer Use smoke test in Brave

The live smoke test with three separate commands and a longer report response showed: gpt-5.4 (Medium) · 34 sec · 471 tokens · 14.2 tok/sec · Time-to-first: 28 sec · 3 tool calls, which matches the intended accounting.

Contributor

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

...(assistantTimeToFirstTokenMs !== undefined
? { timeToFirstTokenMs: assistantTimeToFirstTokenMs }
: {}),
};
Derived TTFT overwrites provider-supplied TTFT when duration missing

Low Severity

buildContextWindowActivityPayload computes providerTimeToFirstTokenMs from the usage payload but never guards against overwriting it in the fallback branch. When a provider supplies timeToFirstTokenMs without durationMs, the final spread { ...usage, ...(assistantTimeToFirstTokenMs !== undefined ? { timeToFirstTokenMs: assistantTimeToFirstTokenMs } : {}) } overwrites the provider's value with the derived one. The first branch correctly preserves provider values, but the else-branch does not check providerTimeToFirstTokenMs before merging.
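
One possible shape for the guard this comment asks for is to resolve each timing field independently, preferring the provider's value and falling back to the derived one only when the provider's is absent. The sketch below is not the actual buildContextWindowActivityPayload; the `TokenUsage` and `DerivedTiming` types and the `mergeTiming` name are assumptions.

```typescript
// Hypothetical subset of the usage payload.
interface TokenUsage {
  outputTokens?: number;
  durationMs?: number;
  timeToFirstTokenMs?: number;
}

// Hypothetical timings derived from assistant stream boundaries.
interface DerivedTiming {
  durationMs?: number;
  timeToFirstTokenMs?: number;
}

// Provider-supplied values win field by field; derived values only fill gaps.
function mergeTiming(usage: TokenUsage, derived: DerivedTiming): TokenUsage {
  const durationMs = usage.durationMs ?? derived.durationMs;
  const timeToFirstTokenMs = usage.timeToFirstTokenMs ?? derived.timeToFirstTokenMs;
  return {
    ...usage,
    ...(durationMs !== undefined ? { durationMs } : {}),
    ...(timeToFirstTokenMs !== undefined ? { timeToFirstTokenMs } : {}),
  };
}

// Provider supplies TTFT but no duration: TTFT is kept, duration is derived.
const merged = mergeTiming(
  { outputTokens: 16, timeToFirstTokenMs: 500 },
  { durationMs: 1_000, timeToFirstTokenMs: 900 },
);
console.log(merged); // { outputTokens: 16, timeToFirstTokenMs: 500, durationMs: 1000 }
```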


Development

Successfully merging this pull request may close these issues.

[Feature]: Add compact per-turn session stats footer

2 participants