feat(tui): show output token throughput#36
Conversation
🦋 Changeset detectedLatest commit: 4f65ae0 The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 33194ea4df
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // Starts or updates a rendered tool call from a tool-call start event. | ||
| private handleToolCall(event: ToolCallStartedEvent): void { | ||
| this.flushStreamingUiUpdatesNow(); | ||
| this.state.currentStepModelFinishedAtMs ??= Date.now(); |
There was a problem hiding this comment.
Capture model-finish time before tool preflight latency
Seting currentStepModelFinishedAtMs on tool.call.started makes the throughput calculation include non-model delays for tool steps. In this codebase, tool.call.started is emitted only after tool preflight/prepare work (including hooks and potential approval waits) completes, so updateOutputTokenThroughput can divide by a much larger elapsed time and report misleadingly low tok/s whenever a tool call is delayed before dispatch. This regresses the accuracy of the new footer speed metric specifically for tool-calling turns.
Useful? React with 👍 / 👎.
| const tokensPerSecond = outputTokensPerSecond( | ||
| usage, | ||
| this.state.currentStepStartedAtMs, | ||
| endedAtMs, |
There was a problem hiding this comment.
Exclude retry backoff time from throughput denominator
updateOutputTokenThroughput always divides by elapsed wall-clock time since turn.step.started, but turn.step.completed.usage reflects only the successful LLM attempt. When a step retries (network/provider retryable errors), this elapsed window includes retry sleeps and failed attempts, which can drastically underreport tok/s even though model generation speed was normal. The metric therefore becomes misleading in any retried step.
Useful? React with 👍 / 👎.
|
Addressed in 4f65ae0 by deriving throughput from the observed model streaming window instead of step wall-clock time or The footer now records the first/last assistant, thinking, or tool-call argument delta for the current step, then computes output tok/s from that window once usage arrives. That excludes retry backoff, tool preflight/approval, and tool execution time. If no model deltas were observed, it skips the speed display rather than showing a misleading value. Added regression coverage for delayed tool-step completion and the no-delta fallback. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4f65ae0fc6
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const startedAtMs = this.state.currentStepFirstDeltaAtMs; | ||
| const endedAtMs = this.state.currentStepLastDeltaAtMs; | ||
| if (startedAtMs === undefined || endedAtMs === undefined) return; |
There was a problem hiding this comment.
Use step-end time when only one delta is observed
This throughput path uses currentStepLastDeltaAtMs as the end timestamp, so when a step emits exactly one model delta (a common pattern in this repo’s agent-core event snapshots), startedAtMs and endedAtMs are identical and outputTokensPerSecond returns null. In that case the new footer speed metric silently disappears even though turn.step.completed.usage.output is positive, so short or single-chunk responses never show a speed value.
Useful? React with 👍 / 👎.
|
Thank you for your interest in contributing to Kimi Code. For new features, please create an issue or discuss it in an existing issue. We are not currently accepting pull requests for new features. |
Related Issue
Resolve #35
Problem
See linked issue. Kimi Code did not surface the model output token throughput that users can see in kimi-cli.
What changed
Checklist
gen-changesetsskill, or this PR needs no changeset.gen-docsskill, or this PR needs no doc update.