
DeepSeek V4 Drops This Week — A Trillion-Parameter Multimodal Model Trained on Chinese Chips

Michael Ouroumis · 2 min read

DeepSeek, the Hangzhou-based AI lab that shook the industry with its R1 reasoning model in January 2025, is preparing to release its most ambitious model yet. V4 is a natively multimodal system capable of generating text, images, and video — and it was built on Chinese-made silicon.

What Makes V4 Different

Unlike previous models that bolted vision capabilities onto text-only foundations, V4 was trained on text, image, video, and audio data simultaneously from the ground up. This native multimodality means the model does not treat images as an afterthought — it reasons across modalities as a single integrated system.

The numbers are significant. V4 features roughly one trillion total parameters with approximately 32 billion active per forward pass: a roughly 50% increase in total model size over V3.2, even as active parameters drop from V3.2's 37 billion. The context window jumps to one million tokens, a major leap that positions V4 for enterprise document processing and long-form code generation.
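The sparsity arithmetic behind these figures is easy to sanity-check. A minimal sketch using the article's claimed numbers, plus one assumption: that V3.2's total parameter count is the ~671 billion reported for DeepSeek-V3 (the article does not state V3.2's total).

```python
# Sanity-checking the reported V4 figures.
# All values are the article's claims, except V3.2's total,
# which is assumed equal to DeepSeek-V3's reported 671B.

V3_2_TOTAL = 671e9    # assumption: V3.2 keeps V3's ~671B total
V3_2_ACTIVE = 37e9    # article: 37B active per token in V3.2
V4_TOTAL = 1e12       # article: roughly one trillion total
V4_ACTIVE = 32e9      # article: ~32B active per forward pass

# Growth in total size and per-token activation fractions
growth = (V4_TOTAL - V3_2_TOTAL) / V3_2_TOTAL
v4_active_frac = V4_ACTIVE / V4_TOTAL
v3_active_frac = V3_2_ACTIVE / V3_2_TOTAL

print(f"Total-size growth over V3.2: {growth:.0%}")       # ~49%, i.e. the article's "50%"
print(f"V4 active fraction per token: {v4_active_frac:.1%}")
print(f"V3.2 active fraction per token: {v3_active_frac:.1%}")
```

Under these assumptions V4 activates about 3.2% of its weights per token versus roughly 5.5% for V3.2, which is how a much larger model can end up cheaper per forward pass.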

The Hardware Story

The most geopolitically charged detail: DeepSeek partnered with Huawei and Cambricon to optimize V4 for their latest AI chips. Despite US export restrictions on advanced semiconductors to China, DeepSeek has demonstrated that Chinese hardware can support frontier model training. Whether V4's performance holds up against models trained on NVIDIA's latest GPUs will be the benchmark that matters most.

What It Targets

Internal testing suggests V4 is optimized primarily for coding and long-context software engineering tasks. DeepSeek claims it could outperform Claude and ChatGPT on long-context coding benchmarks — a claim the community will verify within days of release.

The model will be released under an open-source license, continuing DeepSeek's strategy of undercutting Western labs on both price and accessibility. For developers already running DeepSeek R2 locally, V4 represents a significant upgrade path.

Why It Matters

V4 is not just another model release. It is a proof point that frontier AI development can happen outside the NVIDIA ecosystem, that open-source multimodal models can compete with proprietary ones, and that China's AI labs are not slowing down despite regulatory pressure. The timing — coinciding with China's Two Sessions parliamentary meetings — underscores the strategic significance Beijing places on domestic AI capabilities.

The AI community should have weights in hand within days.

