
chore(deepinfra): add 45 new models and update pricing#1687

Open
RioPlay wants to merge 9 commits into anomalyco:dev from RioPlay:dev

Conversation

@RioPlay RioPlay commented May 2, 2026

Summary

Added 45 new model definitions and updated pricing across all DeepInfra models to match current deepinfra.com listings.

Models Added (45)

DeepSeek (7): R1-0528-Turbo, R1-Distill-Llama-70B, V3, V3-0324, V3.1, V3.1-Terminus, V3.2, V4-Flash
Qwen (17): 3-14B, 3-30B-A3B, 3-32B, 3-Coder-480B-Instruct, 3-Next-80B-A3B, 3-VL-30B-A3B, 3-VL-235B-A22B, 3-235B-A22B-Instruct-2507, 3-235B-A22B-Thinking-2507, 3.5 (0.8B, 2B, 4B, 9B, 27B, 35B-A3B, 122B-A10B, 397B-A17B), 3.6 (27B, 35B-A3B)
Google (5): gemma-3 (4B, 12B, 27B), gemma-4 (26B-A4B, 31B)
NVIDIA (5): Llama-3.3-Nemotron-Super-49B-v1.5, Nemotron-3-Nano-30B-A3B, Nemotron-3-Nano-Omni-30B-A3B-Reasoning, NVIDIA-Nemotron-3-Super-120B-A12B, NVIDIA-Nemotron-Nano-9B-v2
Meta (2): Llama-3.2-11B-Vision-Instruct, Llama-Guard-4-12B
Mistral (3): Mistral-Nemo-Instruct-2407, Mistral-Small-24B-Instruct-2501, Mistral-Small-3.2-24B-Instruct-2506
Moonshot (3): Kimi-K2.5, Kimi-K2.6, Kimi-K2-Instruct-0905
GLM (2 new files): GLM-4.5, GLM-4.6, GLM-4.6V, GLM-4.7, GLM-4.7-Flash, GLM-5, GLM-5.1
Others (5): phi-4, Step-3.5-Flash, Hermes-3-Llama-3.1-405B, Hermes-3-Llama-3.1-70B, MythoMax-L2-13b, L3-8B-Lunaris-v1-Turbo, L3.1-70B-Euryale-v2.2, gpt-oss-120b, gpt-oss-120b-Turbo, gpt-oss-20b

Pricing Updates

  • GLM-5.1: input 1.4→1.05, output 4.4→3.50, cache 0.26→0.205
  • GLM-4.7: input 0.43→0.40
  • GLM-5: input 0.8→0.60, output 2.56→2.08, cache 0.16→0.12
  • MiniMax-M2.5: input 0.27→0.15, output 0.95→1.15
  • MiniMax-M2: updated input/output
  • Kimi-K2.5: input 0.50→0.45, output 2.80→2.25, cache 0.09→0.07
  • Qwen3.5-35B-A3B: input 0.2→0.18, output 0.95→1.00
  • Qwen3.6-35B-A3B: input 0.20→0.15, output 1.00→0.95
  • Qwen3-Coder-480B-Turbo: input 0.3→0.30, output 1.2→1.00
  • gpt-oss-120b: input 0.05→0.039, output 0.24→0.19
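A quick way to sanity-check a batch of pricing changes like the list above is to diff the old and new per-million-token rates. The sketch below is illustrative only: the `(old, new)` pairs are copied from two entries in this PR's summary, and the dict layout is an assumption, not the repo's actual file format.

```python
# Hedged sketch: recompute the deltas for a couple of the pricing
# updates listed above (USD per 1M tokens, per the PR summary).
updates = {
    "GLM-5.1": {"input": (1.4, 1.05), "output": (4.4, 3.50), "cache_read": (0.26, 0.205)},
    "gpt-oss-120b": {"input": (0.05, 0.039), "output": (0.24, 0.19)},
}

for model, fields in updates.items():
    for field, (old, new) in fields.items():
        pct = (new - old) / old * 100  # signed percent change
        print(f"{model} {field}: {old} -> {new} ({pct:+.1f}%)")
```

Printing the signed percent change makes it easy to spot an update that accidentally raised a price that was meant to drop.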

Fixes

  • Renamed invalid cache fields (cached, cached_input, cached_read) → cache_read across 4 DeepInfra files
  • Added missing cache_read for GLM-4.7-Flash and MiniMax-M2.5
  • Added missing # https://deepinfra.com/... comment headers
  • Fixed trailing whitespace and missing trailing newlines on all files
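The rename and whitespace fixes above are mechanical enough to script. This is a minimal sketch, assuming one TOML file per model under a `providers/deepinfra` directory with bare `field = value` lines; the path and key layout are assumptions, so adjust to the repo's actual structure.

```python
import re
from pathlib import Path

# Invalid cache field names this PR renames to cache_read.
ALIASES = ("cached", "cached_input", "cached_read")

def normalize(text: str) -> str:
    for bad in ALIASES:
        # Rename e.g. `cached_input = 0.09` -> `cache_read = 0.09`;
        # the lookahead keeps the match anchored to a key assignment.
        text = re.sub(rf"^{bad}(?=\s*=)", "cache_read", text, flags=re.M)
    # Strip trailing whitespace and guarantee exactly one trailing newline.
    lines = [line.rstrip() for line in text.splitlines()]
    return "\n".join(lines).rstrip("\n") + "\n"

root = Path("providers/deepinfra")  # assumed location of the model files
if root.is_dir():
    for path in root.rglob("*.toml"):
        path.write_text(normalize(path.read_text()))
```

Anchoring the pattern with `^` and the `re.M` flag keeps the substitution from touching values or comments that merely contain the word "cached".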

**Note:** AI was used to complete this task.

RioPlay added 8 commits May 2, 2026 02:07
- Update pricing on GLM-5, MiniMax-M2.5, Kimi-K2.5, Qwen3.5-35B-A3B,
  Qwen3.6-35B-A3B, Qwen3-Coder-480B-A35B-Instruct-Turbo to match
  deepinfra.com current rates
- Add 5 nvidia models (Nemotron 3 series)
- Add 7 deepseek-ai models (V4-Flash, V3 variants, R1 variants)
- Add 5 google models (gemma 4 and gemma 3 series)
- Add 13 Qwen models (Qwen3.5, Qwen3.6, Qwen3, Qwen3-VL, Qwen2.5)
- Add Step-3.5-Flash, 2 meta-llama, 1 phi-4, 3 mistralai
- Add gpt-oss-120b-Turbo, 2 NousResearch, Gryphe, 2 Sao10K
@RioPlay RioPlay closed this May 2, 2026
@RioPlay RioPlay reopened this May 2, 2026

RioPlay commented May 2, 2026

The only reason I reverted the vercel item is that it's out of scope for this particular submission.
