
fix: Token limits for Amazon Bedrock Kimi K2 models#1637

Open
msukmanowsky wants to merge 2 commits into anomalyco:dev from elvexai:fix/amazon-bedrock-kimi-token-limits

Conversation

@msukmanowsky

@msukmanowsky msukmanowsky commented Apr 28, 2026

Summary

Corrects the context window and max output token limits for two Kimi K2 models on Amazon Bedrock:

  • moonshot.kimi-k2-thinking
  • moonshotai.kimi-k2.5

Changes

Field          Before     After
limit.context  256,000    262,143
limit.output   256,000    16,000

The previous values appeared to be placeholders. The corrected values match the actual limits advertised by Amazon Bedrock for these models.

https://docs.aws.amazon.com/bedrock/latest/userguide/model-card-moonshot-ai-kimi-k2-5.html
https://docs.aws.amazon.com/bedrock/latest/userguide/model-card-moonshot-ai-kimi-k2-thinking.html
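
For reference, if the provider catalog stores per-model limits as JSON, the corrected entries would look roughly like the sketch below. The nesting and key names follow the PR's `limit.context` / `limit.output` labels; the exact file layout is an assumption, not the repository's actual schema.

```json
{
  "moonshot.kimi-k2-thinking": {
    "limit": { "context": 262143, "output": 16000 }
  },
  "moonshotai.kimi-k2.5": {
    "limit": { "context": 262143, "output": 16000 }
  }
}
```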

Correct context and output limits for moonshot.kimi-k2-thinking and
moonshotai.kimi-k2.5 on Amazon Bedrock:
- context: 256_000 → 262_143
- output: 256_000 → 16_000
@rekram1-node
Collaborator

Mind linking the doc for that?

@msukmanowsky
Author

@rekram1-node added links in the comments but also for your reference:

https://docs.aws.amazon.com/bedrock/latest/userguide/model-card-moonshot-ai-kimi-k2-5.html
https://docs.aws.amazon.com/bedrock/latest/userguide/model-card-moonshot-ai-kimi-k2-thinking.html

Neither actually specifies the exact output limit, so I've left it at 16k. The context window was validated empirically via manual testing.
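
The empirical validation mentioned above can be done by bisecting the largest accepted value. A minimal sketch, with `probe` stubbed out; in real use, `probe(n)` would issue a Bedrock `converse` call with `inferenceConfig={"maxTokens": n}` (or a prompt of `n` tokens for the context side) and report whether the request was accepted:

```python
def find_max_limit(probe, lo=1, hi=1_000_000):
    """Binary-search the largest n in [lo, hi] for which probe(n) succeeds.

    Assumes probe is monotone: once it fails at some n, it fails for all
    larger n. `probe` here is a hypothetical callable standing in for a
    real request against the service.
    """
    while lo < hi:
        mid = (lo + hi + 1) // 2  # bias upward so the search converges
        if probe(mid):
            lo = mid  # mid is accepted; the true limit is at least mid
        else:
            hi = mid - 1  # mid is rejected; the true limit is below mid
    return lo

# Stubbed example: pretend the real cap is 16_384 (2**14).
print(find_max_limit(lambda n: n <= 16_384))  # prints 16384
```

Each probe halves the search space, so even a 1,000,000-token range needs only about 20 requests.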

@msukmanowsky msukmanowsky changed the title Fix token limits for Amazon Bedrock Kimi K2 models fix: Token limits for Amazon Bedrock Kimi K2 models Apr 29, 2026
@rekram1-node
Collaborator

[Screenshot 2026-04-29 at 10 07 12 AM]

Thanks, and I think they do specify the output limit; see the screenshot above.

@msukmanowsky
Author

@rekram1-node they do, but I was referring to the lack of specificity. I highly doubt it's actually 16,000; it's more likely something like 16,384 (2**14). We may be a little conservative here, but that's better than the status quo, which assumed you could use the full context window for output.

@msukmanowsky
Author

@rekram1-node bumping this!

