Chinese AI lab DeepSeek on Friday released preview versions of DeepSeek-V4-Pro and DeepSeek-V4-Flash, the long-awaited follow-up to the reasoning model that rattled Silicon Valley in January 2025. The Hangzhou-based firm published both models under an MIT license with open weights, pairing a 1 million token context window with pricing that lands well below Western frontier providers.
The release lands almost exactly a year after DeepSeek's earlier R1 model triggered a global reassessment of how much capital and compute a competitive frontier model actually requires. With V4, the company is again betting that efficiency and openness, not just raw capability, are the pressure points on the incumbents.
Two sizes, one architecture
V4-Pro carries 1.6 trillion total parameters with roughly 49 billion active per token, using a mixture-of-experts design. V4-Flash is a leaner 284 billion parameters with about 13 billion active. Both share the same 1 million token context and are available as open weights, with published file sizes of 865GB for Pro and 160GB for Flash.
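Those figures imply heavy sparsity: only a small slice of each model's weights fires per token. A minimal sketch of the arithmetic, using only the parameter counts DeepSeek disclosed:

```python
# Active-parameter fraction per token, from DeepSeek's published figures.
models = {
    "V4-Pro":   {"total": 1.6e12, "active": 49e9},
    "V4-Flash": {"total": 284e9,  "active": 13e9},
}

for name, p in models.items():
    frac = p["active"] / p["total"]
    print(f"{name}: {frac:.1%} of parameters active per token")
```

Roughly 3% of V4-Pro and 5% of V4-Flash is active on any given token, which is where the low per-token serving cost comes from.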
Performance, per DeepSeek's own disclosures, puts V4-Pro at the head of the open-weight pack — ahead of recent releases such as Kimi K2.6 and GLM-5.1 — while trailing closed frontier systems like GPT-5.4 and Gemini 3.1-Pro by what the team characterizes as roughly three to six months. The model "significantly leads other open-source models" on world-knowledge benchmarks, the company said in its WeChat announcement, naming Gemini 3.1-Pro as the primary exception.
Aggressive pricing and agent focus
DeepSeek is charging $0.14 per million input tokens and $0.28 per million output tokens for V4-Flash, and $1.74 / $3.48 per million for V4-Pro. That positions Flash as one of the cheapest serving options for a long-context model at this capability tier.
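To see what those rates mean for a long-context agent run, here is a rough cost sketch. Only the per-million-token prices come from DeepSeek's announcement; the session token counts are hypothetical:

```python
# Published rates in USD per million tokens: (input, output).
RATES = {
    "V4-Flash": (0.14, 0.28),
    "V4-Pro":   (1.74, 3.48),
}

def cost_usd(model, input_tokens, output_tokens):
    """Estimate one request's cost from the per-million-token rates."""
    rate_in, rate_out = RATES[model]
    return (input_tokens * rate_in + output_tokens * rate_out) / 1e6

# Hypothetical agent session: a full 1M-token context, 20K tokens generated.
print(f"Flash: ${cost_usd('V4-Flash', 1_000_000, 20_000):.3f}")
print(f"Pro:   ${cost_usd('V4-Pro',   1_000_000, 20_000):.3f}")
```

On these assumptions a maxed-out context window costs well under $0.15 on Flash and under $2 on Pro per request, which is the economics the enterprise-agent pitch rests on.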
The company explicitly optimized the models for agentic coding tools, naming Claude Code, OpenClaw, OpenCode, and CodeBuddy as targets for compatibility. DeepSeek also reports large efficiency gains versus V3.2: V4-Pro uses about 27% of the per-token compute and 10% of the KV-cache footprint at 1M-token contexts, with V4-Flash cutting further to roughly 10% and 7%.
Why it matters
The preview ships into a market where Western frontier labs have leaned on ever-larger capital commitments — multi-gigawatt data centers, multi-billion-dollar chip deals — to justify premium pricing. An open-weight model with a 1M context and sub-dollar per-million-token economics reframes that conversation, particularly for enterprise agent workloads where context length and unit cost dominate the bill.
It also reinforces a trend captured in Stanford's 2026 AI Index, which documented that the measured performance gap between leading Chinese and US models has narrowed to a few percentage points. DeepSeek's decision to release V4 openly, rather than behind a closed API, ensures the pressure propagates: every competitor that prices against proprietary frontier models now has a credible, downloadable alternative to benchmark against.
A final V4 release date was not disclosed.