Mistral Ships Medium 3.5 and Vibe Remote Agents: 128B Open-Weight Model Targets Async Coding

Michael Ouroumis · 2 min read

Mistral pushed out one of the more aggressive open-weight launches of the year this week, releasing Medium 3.5 — a dense 128-billion-parameter model with a 256K context window — alongside Vibe, a new cloud platform for running coding agents asynchronously. The combination lands as European AI labs push to keep pace with U.S. and Chinese frontier releases while preserving permissive licensing.

The Paris-based startup is positioning Medium 3.5 as an enterprise-grade open-weight option that consolidates instruction-following, reasoning, and coding into a single set of weights, rather than shipping separate specialist models. According to Mistral's announcement, the model is available immediately through Vibe, Le Chat, the API, Hugging Face, and NVIDIA endpoints.

What Medium 3.5 brings

The headline numbers: a dense 128B parameter count, a 256K context window, configurable reasoning effort per request, and a custom-trained vision encoder that handles variable image sizes and aspect ratios. Mistral reports 77.6% on SWE-Bench Verified and 91.4 on τ³-Telecom, with the model replacing both Mistral Medium 3.1 and Magistral inside Le Chat.
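The "configurable reasoning effort per request" claim suggests a knob exposed at the API level. As a minimal sketch of what that could look like against Mistral's standard chat completions endpoint — note that the model id `mistral-medium-3.5` and the `reasoning_effort` field name are assumptions for illustration, not confirmed by the announcement:

```python
import os
import requests

# Hedged sketch of a per-request reasoning-effort call. The model id
# "mistral-medium-3.5" and the "reasoning_effort" field are assumptions;
# check Mistral's API reference for the shipped parameter names.
API_URL = "https://api.mistral.ai/v1/chat/completions"

resp = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "mistral-medium-3.5",   # assumed model id
        "reasoning_effort": "high",      # assumed knob, e.g. low/medium/high
        "messages": [
            {"role": "user", "content": "Find the bug in this diff and propose a fix."}
        ],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```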

The license matters as much as the benchmarks. Medium 3.5 ships under a Modified MIT license with open weights, and Mistral says self-hosting is feasible on as few as four GPUs — a deliberate pitch to enterprises that want frontier-class capability without committing to a closed API.
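For teams weighing that self-hosting pitch, a four-GPU deployment would typically mean tensor parallelism in a serving stack like vLLM. A minimal sketch follows, assuming the weights land on Hugging Face under a repo id like `mistralai/Mistral-Medium-3.5` (an assumption) and that vLLM supports the architecture:

```python
from vllm import LLM, SamplingParams

# Sketch of the four-GPU self-hosting target using vLLM tensor parallelism.
# The Hugging Face repo id is an assumption; a dense 128B model will likely
# also need quantization or a reduced max_model_len to fit on four GPUs.
llm = LLM(
    model="mistralai/Mistral-Medium-3.5",  # assumed repo id
    tensor_parallel_size=4,                # shard weights across four GPUs
    max_model_len=65536,                   # trim from 256K if memory is tight
)

outputs = llm.generate(
    ["Write a unit test for a binary search function."],
    SamplingParams(max_tokens=512, temperature=0.2),
)
print(outputs[0].outputs[0].text)
```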

API pricing is set at $1.50 per million input tokens and $7.50 per million output tokens, slotting between low-cost commodity models and premium frontier offerings from OpenAI, Anthropic, and Google.
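At those rates, the economics of an agentic coding workload are easy to check. A quick back-of-envelope calculation, using an illustrative request size:

```python
# Cost check at the published rates:
# $1.50 per million input tokens, $7.50 per million output tokens.
INPUT_RATE = 1.50 / 1_000_000
OUTPUT_RATE = 7.50 / 1_000_000

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at Medium 3.5's list prices."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Illustrative agentic coding request: a 200K-token repo context in,
# a 4K-token patch out.
print(f"${request_cost(200_000, 4_000):.4f}")  # $0.3300
```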

Vibe and the agent layer

The more strategically interesting piece may be Vibe. Rather than the synchronous chat-style coding agents that dominate developer tooling today, Vibe runs sessions asynchronously in cloud sandboxes — multiple agents working in parallel, with developers notified when tasks complete.
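Mistral has not published Vibe's client API in the announcement, but the fan-out-and-notify pattern it describes is straightforward. A generic asyncio sketch of that pattern — `run_agent_in_sandbox` is a hypothetical stand-in, not Vibe's actual interface:

```python
import asyncio

# Generic sketch of the async pattern Vibe describes: fan several agent
# tasks out to cloud sandboxes in parallel, then surface each result as
# it completes. run_agent_in_sandbox is hypothetical, not Vibe's API.
async def run_agent_in_sandbox(task: str) -> str:
    await asyncio.sleep(1)  # placeholder for a long-running cloud session
    return f"done: {task}"

async def main() -> None:
    tasks = [
        asyncio.create_task(run_agent_in_sandbox(t))
        for t in ("fix flaky CI test", "bump deps", "triage Sentry issue")
    ]
    # Developers are notified as each agent finishes, not when all do.
    for finished in asyncio.as_completed(tasks):
        print(await finished)

asyncio.run(main())
```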

Mistral describes a "teleport" feature that lets developers move a local CLI session into the cloud while preserving session history, task state, and approvals. Vibe integrates with GitHub, Linear, Jira, Sentry, Slack, and Teams, putting it into direct competition with Claude Code, OpenAI Codex, Cursor, and a growing roster of agent-native developer platforms.

Why this matters

The release lands during an unusually crowded stretch for model launches. OpenAI shipped GPT-5.5 in late April, Anthropic continues iterating Claude Opus 4.7, Google rolled out Gemini 3.1 Ultra, and Chinese labs including DeepSeek and Tencent have pushed open-source competitors at much lower price points.

Mistral's bet is that a permissively licensed dense model — paired with an async agent runtime — gives regulated industries and sovereignty-conscious European buyers an alternative they can actually deploy on their own infrastructure. Whether enterprises adopt Vibe as the orchestration layer or simply use Medium 3.5 weights inside their existing agent stacks will determine how much of the package sticks.

For developers evaluating the offering today, the calculus is straightforward: a 128B dense model with a 256K context window and a four-GPU self-hosting target is a meaningful new option in the open-weight tier — particularly for teams that have been hesitant to depend solely on closed frontier APIs.

