DeepSeek V4 Drops This Week — A Trillion-Parameter Multimodal Model Trained on Chinese Chips

Michael Ouroumis · 2 min read

DeepSeek, the Hangzhou-based AI lab that shook the industry with its R1 reasoning model in January 2025, is preparing to release its most ambitious model yet. V4 is a natively multimodal system capable of generating text, images, and video — and it was built on Chinese-made silicon.

What Makes V4 Different

Unlike previous models that bolted vision capabilities onto text-only foundations, V4 was trained on text, image, video, and audio data simultaneously from the ground up. This native multimodality means the model does not treat images as an afterthought — it reasons across modalities as a single integrated system.

The numbers are significant. V4 features roughly one trillion total parameters with approximately 32 billion active per forward pass: about a 50% increase in total size over V3.2, even as the active-parameter count drops from V3.2's 37 billion. The context window jumps to one million tokens, a major leap that positions V4 for enterprise document processing and long-form code generation.
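
To see how a trillion-parameter model can use only about 32 billion parameters per token, it helps to remember that a mixture-of-experts network routes each token to a small subset of its experts. The back-of-the-envelope sketch below uses purely illustrative layer counts and expert sizes (V4's real configuration has not been published) just to show the arithmetic.

```python
# Rough mixture-of-experts parameter arithmetic.
# All dimensions below are illustrative guesses, not DeepSeek V4's actual configuration.

def moe_params(d_model: int, d_ff: int, n_layers: int,
               n_experts: int, experts_per_token: int) -> tuple[int, int]:
    """Estimate total vs. active parameters for a simplified MoE stack.

    Counts only the gated feed-forward experts (gate, up, and down projections),
    which dominate the parameter budget; attention and embeddings are ignored.
    """
    per_expert = 3 * d_model * d_ff                      # gate + up + down matrices
    total = n_layers * n_experts * per_expert            # every expert in every layer
    active = n_layers * experts_per_token * per_expert   # only the routed experts per token
    return total, active

# Hypothetical configuration chosen only to land near the reported numbers.
total, active = moe_params(d_model=7168, d_ff=2560, n_layers=72,
                           n_experts=256, experts_per_token=8)
print(f"total ≈ {total / 1e9:.0f}B, active ≈ {active / 1e9:.0f}B")
```

With these made-up dimensions the script prints roughly 1,000B total against about 32B active, which is the shape of the trade-off V4 reportedly makes.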

The Hardware Story

The most geopolitically charged detail: DeepSeek partnered with Huawei and Cambricon to optimize V4 for their latest AI chips. Despite US export restrictions on advanced semiconductors to China, DeepSeek has demonstrated that Chinese hardware can support frontier model training. Whether V4's performance holds up against models trained on NVIDIA's latest GPUs will be the benchmark that matters most.

What It Targets

Internal testing suggests V4 is optimized primarily for coding and long-context software engineering tasks. DeepSeek claims it could outperform Claude and ChatGPT on long-context coding benchmarks — a claim the community will verify within days of release.

The model will be released under an open-source license, continuing DeepSeek's strategy of undercutting Western labs on both price and accessibility. For developers already running DeepSeek R2 locally, V4 represents a significant upgrade path.
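
Because DeepSeek's earlier open releases have been reachable through its OpenAI-compatible API as well as self-hosted weights, the upgrade for most developers would likely amount to swapping a model identifier. The sketch below assumes that pattern holds; the "deepseek-v4" name is a placeholder, since the real identifier has not been announced.

```python
# Minimal sketch of querying a DeepSeek model through its OpenAI-compatible API.
# "deepseek-v4" is a hypothetical placeholder; the real model name is not yet published.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",
    base_url="https://api.deepseek.com",  # DeepSeek's OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="deepseek-v4",  # placeholder identifier for the new release
    messages=[
        {"role": "system", "content": "You are a careful senior software engineer."},
        {"role": "user", "content": "Summarize what changed across these three modules."},
    ],
)
print(response.choices[0].message.content)
```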

Why It Matters

V4 is not just another model release. It is a proof point that frontier AI development can happen outside the NVIDIA ecosystem, that open-source multimodal models can compete with proprietary ones, and that China's AI labs are not slowing down despite regulatory pressure. The timing — coinciding with China's Two Sessions parliamentary meetings — underscores the strategic significance Beijing places on domestic AI capabilities.

The AI community should have weights in hand within days.

More in Models

Microsoft Releases Phi-4-Reasoning-Vision-15B: A Small Model That Knows When to Think
Microsoft open-sources Phi-4-reasoning-vision-15B, a compact 15B-parameter multimodal model that selectively activates chain-of-thought reasoning and rivals models many times its size.
8 hours ago · 2 min read

Anthropic Releases Claude Opus 4.6 — Its Most Capable Agentic Coding Model
Anthropic launches Claude Opus 4.6, a frontier model purpose-built for autonomous coding agents that can plan, execute, and debug multi-file projects with minimal human oversight.
1 day ago · 2 min read

Meta Releases Llama 4 Maverick With 400B Parameters Under Open Weights
Meta releases Llama 4 Maverick, a 400-billion parameter mixture-of-experts model under its open weights license, matching GPT-5 on key benchmarks and reigniting the open-source AI debate.
1 day ago · 2 min read