
feat(tui): show output token throughput #36

Closed
Randy-sin wants to merge 2 commits into MoonshotAI:main from Randy-sin:feat/footer-token-throughput

Conversation

@Randy-sin

Related Issue

Resolve #35

Problem

See linked issue. Kimi Code did not surface the model output token throughput that users can see in kimi-cli.

What changed

  • Track the elapsed time for each model step and compute output tokens per second from reported usage.
  • Show the most recent output speed in the footer next to the context usage line.
  • Keep transient footer hints higher priority so exit/cancel hints can still replace the speed line temporarily.
  • Add tests for throughput calculation and footer rendering.
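
The calculation in the first bullet can be sketched roughly as follows. The function name `outputTokensPerSecond` and its argument order appear in the reviewed diff, but the body, the `StepUsage` type, and the choice to return `null` for unusable input are assumptions made for illustration, not the PR's actual code.

```typescript
// Hypothetical usage shape; only the `output` field is mentioned in this conversation.
interface StepUsage {
  output: number;
}

// Sketch: output tokens divided by the elapsed window, in tokens per second.
// Returns null when the inputs cannot produce a meaningful rate.
function outputTokensPerSecond(
  usage: StepUsage | undefined,
  startedAtMs: number | undefined,
  endedAtMs: number | undefined,
): number | null {
  if (usage === undefined || startedAtMs === undefined || endedAtMs === undefined) {
    return null;
  }
  const elapsedMs = endedAtMs - startedAtMs;
  // A zero or negative window, or a step with no output tokens, has no rate to show.
  if (elapsedMs <= 0 || usage.output <= 0) {
    return null;
  }
  return (usage.output * 1000) / elapsedMs;
}

// 120 output tokens over a 2000 ms window.
const exampleSpeed = outputTokensPerSecond({ output: 120 }, 1000, 3000); // 60
```

Returning `null` instead of `0` lets the footer skip the speed line entirely when no rate can be computed.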

Checklist

  • I have read the CONTRIBUTING document.
  • I have linked a related issue, or explained the problem above.
  • I have added tests that prove my feature works.
  • Ran gen-changesets skill, or this PR needs no changeset.
  • Ran gen-docs skill, or this PR needs no doc update.

@changeset-bot

changeset-bot Bot commented May 25, 2026

🦋 Changeset detected

Latest commit: 4f65ae0

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@moonshot-ai/kimi-code Minor



@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 33194ea4df

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread apps/kimi-code/src/tui/kimi-tui.ts Outdated
// Starts or updates a rendered tool call from a tool-call start event.
private handleToolCall(event: ToolCallStartedEvent): void {
  this.flushStreamingUiUpdatesNow();
  this.state.currentStepModelFinishedAtMs ??= Date.now();

P1: Capture model-finish time before tool preflight latency

Setting currentStepModelFinishedAtMs on tool.call.started makes the throughput calculation include non-model delays for tool steps. In this codebase, tool.call.started is emitted only after tool preflight/prepare work (including hooks and potential approval waits) completes, so updateOutputTokenThroughput can divide by a much larger elapsed time and report misleadingly low tok/s whenever a tool call is delayed before dispatch. This regresses the accuracy of the new footer speed metric specifically for tool-calling turns.
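
To make the size of the skew concrete, here is a small worked example. All timestamps and token counts are invented for illustration, and `tokensPerSecond` is a hypothetical helper rather than a function from the PR.

```typescript
// Hypothetical helper: tokens per second over a millisecond window.
function tokensPerSecond(outputTokens: number, startedAtMs: number, endedAtMs: number): number {
  return (outputTokens * 1000) / (endedAtMs - startedAtMs);
}

// Invented timeline: the model streams for 4 s, then the tool call waits
// 16 s on hooks and approval before tool.call.started is emitted.
const stepStartedAtMs = 0;
const modelFinishedAtMs = 4000;
const toolCallStartedAtMs = 20000;
const streamedOutputTokens = 200;

// Rate over the model's own generation window.
const generationRate = tokensPerSecond(streamedOutputTokens, stepStartedAtMs, modelFinishedAtMs); // 50

// Rate when the end timestamp is taken at tool.call.started instead.
const skewedRate = tokensPerSecond(streamedOutputTokens, stepStartedAtMs, toolCallStartedAtMs); // 10
```

In this made-up case the footer would show 10 tok/s for a model that actually generated at 50 tok/s.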


Comment on lines +2471 to +2474
const tokensPerSecond = outputTokensPerSecond(
  usage,
  this.state.currentStepStartedAtMs,
  endedAtMs,

P2: Exclude retry backoff time from throughput denominator

updateOutputTokenThroughput always divides by elapsed wall-clock time since turn.step.started, but turn.step.completed.usage reflects only the successful LLM attempt. When a step retries (network/provider retryable errors), this elapsed window includes retry sleeps and failed attempts, which can drastically underreport tok/s even though model generation speed was normal. The metric therefore becomes misleading in any retried step.


@Randy-sin
Author

Addressed in 4f65ae0 by deriving throughput from the observed model streaming window instead of step wall-clock time or tool.call.started.

The footer now records the first/last assistant, thinking, or tool-call argument delta for the current step, then computes output tok/s from that window once usage arrives. That excludes retry backoff, tool preflight/approval, and tool execution time. If no model deltas were observed, it skips the speed display rather than showing a misleading value.

Added regression coverage for delayed tool-step completion and the no-delta fallback.
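
The delta-window approach described in this comment could look something like the sketch below. The class `StepDeltaWindow` and its method names are hypothetical and only loosely mirror the `currentStepFirstDeltaAtMs` / `currentStepLastDeltaAtMs` state fields quoted in the later review; the real change lives in the TUI state and may differ.

```typescript
// Hypothetical tracker for the streaming window of a single model step.
class StepDeltaWindow {
  private firstDeltaAtMs: number | undefined;
  private lastDeltaAtMs: number | undefined;

  // Call when a new step starts so stale timestamps are not reused.
  reset(): void {
    this.firstDeltaAtMs = undefined;
    this.lastDeltaAtMs = undefined;
  }

  // Call for every assistant, thinking, or tool-call argument delta.
  recordDelta(nowMs: number): void {
    if (this.firstDeltaAtMs === undefined) {
      this.firstDeltaAtMs = nowMs;
    }
    this.lastDeltaAtMs = nowMs;
  }

  // Call once usage arrives; null means "do not show a speed".
  tokensPerSecond(outputTokens: number): number | null {
    if (this.firstDeltaAtMs === undefined || this.lastDeltaAtMs === undefined) {
      return null; // no model deltas were observed
    }
    const elapsedMs = this.lastDeltaAtMs - this.firstDeltaAtMs;
    if (elapsedMs <= 0 || outputTokens <= 0) {
      return null;
    }
    return (outputTokens * 1000) / elapsedMs;
  }
}

// Deltas observed at 1000 ms, 1500 ms, and 3000 ms; usage reports 100 output tokens.
const deltaWindow = new StepDeltaWindow();
deltaWindow.recordDelta(1000);
deltaWindow.recordDelta(1500);
deltaWindow.recordDelta(3000);
const windowSpeed = deltaWindow.tokensPerSecond(100); // 50
```

Because only delta timestamps feed the window, time spent before the first delta or after the last one (retry backoff, approval waits, tool execution) does not enter the denominator.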


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4f65ae0fc6


Comment on lines +2465 to +2467
const startedAtMs = this.state.currentStepFirstDeltaAtMs;
const endedAtMs = this.state.currentStepLastDeltaAtMs;
if (startedAtMs === undefined || endedAtMs === undefined) return;

P2: Use step-end time when only one delta is observed

This throughput path uses currentStepLastDeltaAtMs as the end timestamp, so when a step emits exactly one model delta (a common pattern in this repo’s agent-core event snapshots), startedAtMs and endedAtMs are identical and outputTokensPerSecond returns null. In that case the new footer speed metric silently disappears even though turn.step.completed.usage.output is positive, so short or single-chunk responses never show a speed value.
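
One possible way to apply this suggestion is sketched below. The helper `resolveThroughputWindow` and its `stepCompletedAtMs` parameter are hypothetical names introduced here, and the PR was closed before any such change was made.

```typescript
interface ThroughputWindow {
  startedAtMs: number;
  endedAtMs: number;
}

// Hypothetical helper: prefer the first/last delta window, but fall back to the
// step-completion time when the delta window has zero length (a single delta).
function resolveThroughputWindow(
  firstDeltaAtMs: number | undefined,
  lastDeltaAtMs: number | undefined,
  stepCompletedAtMs: number,
): ThroughputWindow | null {
  if (firstDeltaAtMs === undefined || lastDeltaAtMs === undefined) {
    return null; // no deltas at all: keep skipping the speed display
  }
  if (lastDeltaAtMs > firstDeltaAtMs) {
    return { startedAtMs: firstDeltaAtMs, endedAtMs: lastDeltaAtMs };
  }
  if (stepCompletedAtMs > firstDeltaAtMs) {
    return { startedAtMs: firstDeltaAtMs, endedAtMs: stepCompletedAtMs };
  }
  return null;
}

// A single delta at 1000 ms, with the step completing at 1800 ms and 40 output tokens.
const fallbackWindow = resolveThroughputWindow(1000, 1000, 1800);
const fallbackSpeed =
  fallbackWindow === null
    ? null
    : (40 * 1000) / (fallbackWindow.endedAtMs - fallbackWindow.startedAtMs); // 50
```

The trade-off is that the step-completion time may again include non-model delays on tool steps, so a fallback like this fits short text-only responses best.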


@liruifengv
Collaborator

Thank you for your interest in contributing to Kimi Code.

For new features, please create an issue or discuss it in an existing issue. We are not currently accepting pull requests for new features.

@liruifengv liruifengv closed this May 26, 2026


Development

Successfully merging this pull request may close these issues.

Missing token throughput speed display
