feat(providers): default DeepSeek to V4-Pro (1M context, Opus-tier coding) #83
Open
Conversation
DeepSeek V4 launched 2026-04-24 with two MoE models — `deepseek-v4-pro`
(1.6T params / 49B activated) and `deepseek-v4-flash` (284B / 13B). Both
ship with a 1M-token context window and dual thinking/non-thinking modes
over the existing OpenAI-compatible interface, so the integration cost
is just a model-name + context-window bump.
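In profile terms the bump is roughly the following (a sketch with illustrative field names, not the repo's actual schema; values are taken from the description above):

```ts
// Sketch only: field names are illustrative, not this repo's schema.
const deepseekProfile = {
  baseUrl: "https://api.deepseek.com", // unchanged
  textModel: "deepseek-v4-pro",        // was "deepseek-chat"
  visionModel: "deepseek-v4-pro",      // placeholder: no native vision model
  contextWindow: 1_000_000,            // was 64_000 (V3-era)
};
```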
Why default to Pro:
• Pro is Claude-Opus-tier on coding benchmarks at ~7× lower price,
and tool-use reliability matters more than per-token cost for an
agent loop. Users on a budget can override with `--text-model
deepseek-v4-flash`.
• The legacy names `deepseek-chat` and `deepseek-reasoner` are
  deprecated effective 2026-07-24, and they currently route to V4-Flash
  anyway. Bumping the default now avoids a silent surprise at the
  deprecation cutoff.
What this does NOT change:
• DeepSeek still has no native vision model. `visionModel` keeps the
placeholder text-model name so the provider profile stays uniform;
real vision needs a mixed pipeline (text=DeepSeek, vision=Anthropic
or GPT-4o or Gemini Flash).
• No new MODEL_QUIRKS entries. V4 thinking mode silently ignores
`temperature` rather than rejecting it (unlike legacy `deepseek-
reasoner`), so the existing request flow works as-is. A quirk for
legacy `deepseek-reasoner` would be additive and is tracked for
follow-up once PR #82's MODEL_QUIRKS infra lands on main.
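Once that infra lands, the legacy quirk would likely be a single additive entry, along these lines (the real shape is defined in PR #82, so this is only a sketch):

```ts
// Sketch only: the actual MODEL_QUIRKS shape comes from PR #82 and
// may differ. Legacy deepseek-reasoner 400s when temperature != 1,
// so the quirk would strip the parameter before sending.
const deepseekReasonerQuirk = {
  match: /^deepseek-reasoner$/,  // hypothetical matcher field
  stripParams: ["temperature"],  // hypothetical field name
};
```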
Validation:
• typecheck clean, lint 0 errors (64 pre-existing warnings unchanged),
434/435 tests passing (1 pre-existing skip).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
`deepseek-chat` (V3-era, 64k ctx) → `deepseek-v4-pro` (1.6T MoE, 1M ctx, Claude-Opus-tier coding). Same `baseUrl`, same auth header; no new code paths, no new env vars, no new dependencies.

Why Pro by default
DeepSeek V4 launched 2026-04-24 with two MoE variants:
• `deepseek-v4-pro` (1.6T params / 49B activated)
• `deepseek-v4-flash` (284B / 13B)

Pro is roughly Claude-Opus-tier on coding benchmarks at ~7× lower price, and tool-use reliability matters more than per-token cost for an agent loop. Budget-conscious users can still flip to Flash via `--text-model deepseek-v4-flash` or `.clawdcursor-config.json`.
The legacy names `deepseek-chat` and `deepseek-reasoner` are deprecated effective 2026-07-24 and currently route to V4-Flash anyway, so leaving the default at `deepseek-chat` would silently surprise users at the deprecation cliff.

What this PR does NOT do
• No new `MODEL_QUIRKS` entries: V4 thinking mode silently ignores `temperature` rather than rejecting it (unlike legacy `deepseek-reasoner`, which 400s on `temperature ≠ 1`). The existing request flow works as-is for V4. A quirk for legacy `deepseek-reasoner` would be additive and is tracked as a follow-up once fix(agent): five model-robustness bugs (parser, quirks, cannot_read, DPI, safety bypass) #82's `MODEL_QUIRKS` infra lands on main.
• `visionModel` keeps the text-model name as a placeholder so the provider profile shape stays uniform; real vision needs a mixed pipeline (text=DeepSeek, vision=Anthropic / GPT-4o / Gemini Flash).
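For illustration, such a mixed pipeline could route on input modality along these lines (hypothetical types and names; not part of this PR):

```ts
// Hypothetical router: text turns go to DeepSeek, image turns to any
// vision-capable provider (Anthropic, GPT-4o, Gemini Flash, ...).
type Turn = { text: string; images?: string[] };

function pickModel(turn: Turn): { baseUrl: string; model: string } {
  if (turn.images?.length) {
    // GPT-4o chosen arbitrarily here; any vision model works.
    return { baseUrl: "https://api.openai.com/v1", model: "gpt-4o" };
  }
  return { baseUrl: "https://api.deepseek.com", model: "deepseek-v4-pro" };
}
```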
Test plan

• `npm run typecheck`: clean.
• `npm run lint`: 0 errors (64 pre-existing warnings unchanged).
• `npm test`: 434/435 passing (1 pre-existing skip), no diff in failure pattern.
• A live smoke test still needs a `DEEPSEEK_API_KEY` (the provider plumbing is unchanged, so this is just verifying the new model name is accepted by the endpoint).
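When a key is available, a check along these lines would do (a sketch only; it assumes DeepSeek's OpenAI-compatible `/chat/completions` route, which this PR does not touch):

```ts
// Sketch of the manual smoke test: does the endpoint accept the new
// model name? Path and payload follow the OpenAI-compatible convention
// the provider already uses; nothing here is repo code.
async function smokeTest(): Promise<void> {
  const res = await fetch("https://api.deepseek.com/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.DEEPSEEK_API_KEY}`,
    },
    body: JSON.stringify({
      model: "deepseek-v4-pro",
      messages: [{ role: "user", content: "ping" }],
      max_tokens: 8,
    }),
  });
  console.log(res.status); // expect 200 once the V4 names are live
}

smokeTest().catch(console.error);
```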
Follow-ups

• `MODEL_QUIRKS` entry for `deepseek-reasoner` (legacy R1 rejects `temperature ≠ 1`); depends on fix(agent): five model-robustness bugs (parser, quirks, cannot_read, DPI, safety bypass) #82 merging first.
• Once DeepSeek ships a native vision model, point the `visionModel` field at it and drop the placeholder comment.

🤖 Generated with Claude Code