
DeepSeek V4 Preview Lands: 1.6T-Parameter Open Model With 1M Context, Flash Pricing at $0.14/M

Michael Ouroumis · 2 min read

Chinese AI lab DeepSeek on Friday released preview versions of DeepSeek-V4-Pro and DeepSeek-V4-Flash, the long-awaited follow-up to the reasoning model that rattled Silicon Valley in January 2025. The Hangzhou-based firm published both models under an MIT license with open weights, pairing a 1 million token context window with pricing that lands well below Western frontier providers.

The release comes almost exactly a year after DeepSeek's earlier R1 model triggered a global reassessment of how much capital and compute a competitive frontier model actually requires. With V4, the company is again betting that efficiency and openness, not just raw capability, are the pressure points on the incumbents.

Two sizes, one architecture

V4-Pro carries 1.6 trillion total parameters with roughly 49 billion active per token, using a mixture-of-experts design. V4-Flash is a leaner 284 billion parameters with about 13 billion active. Both share the same 1 million token context and are available as open weights, with published file sizes of 865GB for Pro and 160GB for Flash.
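Those figures imply heavily sparse activation: only a few percent of each model's parameters fire per token. A quick sketch using the numbers above (the ratio calculation is ours, not DeepSeek's):

```python
# Active-parameter fraction for the two V4 MoE variants,
# using the total/active counts reported in the article (in billions).
models = {
    "V4-Pro":   {"total_b": 1600, "active_b": 49},   # 1.6T total, ~49B active
    "V4-Flash": {"total_b": 284,  "active_b": 13},   # 284B total, ~13B active
}

for name, p in models.items():
    frac = p["active_b"] / p["total_b"]
    print(f"{name}: {frac:.1%} of parameters active per token")
```

That works out to roughly 3% for Pro and under 5% for Flash, which is how a 1.6T-parameter model stays servable at all.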

Performance, per DeepSeek's own disclosures, puts V4-Pro at the head of the open-weight pack — ahead of recent releases such as Kimi K2.6 and GLM-5.1 — while trailing closed frontier systems like GPT-5.4 and Gemini 3.1-Pro by what the team characterizes as roughly three to six months. The model "significantly leads other open-source models" in world-knowledge benchmarks, the company said in its WeChat announcement, with Gemini 3.1-Pro the primary exception.

Aggressive pricing and agent focus

DeepSeek is charging $0.14 per million input tokens and $0.28 per million output tokens for V4-Flash, and $1.74 / $3.48 per million for V4-Pro. That positions Flash as one of the cheapest serving options for a long-context model at this capability tier.
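To make the pricing concrete, here is a back-of-the-envelope cost for one long-context agent run; the token counts are illustrative assumptions, only the per-million rates come from the article:

```python
# USD per million tokens (input, output), from DeepSeek's published rates.
PRICES = {
    "V4-Flash": (0.14, 0.28),
    "V4-Pro":   (1.74, 3.48),
}

def run_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a single request at the published per-million rates."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000

# Hypothetical agent run: 800k tokens of context in, 20k tokens out.
for model in PRICES:
    print(f"{model}: ${run_cost(model, 800_000, 20_000):.4f} per run")
```

Under those assumptions a near-full-context Flash call stays around twelve cents, while the same call on Pro runs about $1.46 — the gap that makes Flash the default for high-volume agent loops.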

The company explicitly optimized the models for agentic coding tools, naming Claude Code, OpenClaw, OpenCode, and CodeBuddy as targets for compatibility. DeepSeek also reports large efficiency gains versus V3.2: V4-Pro uses about 27% of the per-token compute and 10% of the KV-cache footprint at 1M-token contexts, with V4-Flash cutting further to roughly 10% and 7%.
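Expressed as reduction factors rather than percentages, the reported figures look like this (the percentages are DeepSeek's; the conversion is simple arithmetic):

```python
# Reported per-token cost vs. V3.2 at 1M-token contexts, as fractions,
# converted to "x-times cheaper" reduction factors.
reported = {
    "V4-Pro":   {"compute": 0.27, "kv_cache": 0.10},
    "V4-Flash": {"compute": 0.10, "kv_cache": 0.07},
}

for model, fractions in reported.items():
    for metric, frac in fractions.items():
        print(f"{model} {metric}: {1 / frac:.1f}x lower than V3.2")
```

That is roughly a 3.7x compute and 10x KV-cache reduction for Pro, and about 10x and 14x for Flash — the KV-cache number being the one that matters most at 1M-token contexts.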

Why it matters

The preview ships into a market where Western frontier labs have leaned on ever-larger capital commitments — multi-gigawatt data centers, multi-billion-dollar chip deals — to justify premium pricing. An open-weight model with a 1M context and sub-dollar per-million-token economics reframes that conversation, particularly for enterprise agent workloads where context length and unit cost dominate the bill.

It also reinforces a trend captured in Stanford's 2026 AI Index, which documented that the measured performance gap between leading Chinese and US models has narrowed to a few percentage points. DeepSeek's decision to release V4 openly, rather than behind a closed API, ensures the pressure propagates: every competitor that prices against proprietary frontier models now has a credible, downloadable alternative to benchmark against.

A final V4 release date was not disclosed.

