Mistral Ships Medium 3.5 and Vibe Remote Agents: 128B Open-Weight Model Targets Async Coding

Michael Ouroumis · 2 min read

Mistral pushed out one of the more aggressive open-weight launches of the year this week, releasing Medium 3.5 — a dense 128-billion-parameter model with a 256K context window — alongside Vibe, a new cloud platform for running coding agents asynchronously. The combination lands as European AI labs push to keep pace with U.S. and Chinese frontier releases while preserving permissive licensing.

The Paris-based startup is positioning Medium 3.5 as an enterprise-grade open-weight option that consolidates instruction-following, reasoning, and coding into a single set of weights, rather than shipping separate specialist models. According to Mistral's announcement, the model is available immediately through Vibe, Le Chat, the API, Hugging Face, and NVIDIA endpoints.

What Medium 3.5 brings

The headline numbers: a dense 128B parameter count, a 256K context window, configurable reasoning effort per request, and a custom-trained vision encoder that handles variable image sizes and aspect ratios. Mistral reports 77.6% on SWE-Bench Verified and 91.4 on τ³-Telecom, with the model replacing both Mistral Medium 3.1 and Magistral inside Le Chat.
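The "configurable reasoning effort per request" claim suggests a knob exposed at the API level. As a minimal sketch of what that could look like against Mistral's standard chat completions endpoint — note that the model id `mistral-medium-3.5` and the `reasoning_effort` field name are assumptions for illustration, not confirmed by the announcement:

```python
import os
import requests

# Hedged sketch of a per-request reasoning-effort call. The model id
# "mistral-medium-3.5" and the "reasoning_effort" field are assumptions;
# check Mistral's API reference for the shipped parameter names.
API_URL = "https://api.mistral.ai/v1/chat/completions"

resp = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "mistral-medium-3.5",   # assumed model id
        "reasoning_effort": "high",      # assumed knob, e.g. low/medium/high
        "messages": [
            {"role": "user", "content": "Find the bug in this diff and propose a fix."}
        ],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```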

The license matters as much as the benchmarks. Medium 3.5 ships under a Modified MIT license with open weights, and Mistral says self-hosting is feasible on as few as four GPUs — a deliberate pitch to enterprises that want frontier-class capability without committing to a closed API.
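For teams weighing that self-hosting pitch, a four-GPU deployment would typically mean tensor parallelism in a serving stack like vLLM. A minimal sketch follows, assuming the weights land on Hugging Face under a repo id like `mistralai/Mistral-Medium-3.5` (an assumption) and that vLLM supports the architecture:

```python
from vllm import LLM, SamplingParams

# Sketch of the four-GPU self-hosting target using vLLM tensor parallelism.
# The Hugging Face repo id is an assumption; a dense 128B model will likely
# also need quantization or a reduced max_model_len to fit on four GPUs.
llm = LLM(
    model="mistralai/Mistral-Medium-3.5",  # assumed repo id
    tensor_parallel_size=4,                # shard weights across four GPUs
    max_model_len=65536,                   # trim from 256K if memory is tight
)

outputs = llm.generate(
    ["Write a unit test for a binary search function."],
    SamplingParams(max_tokens=512, temperature=0.2),
)
print(outputs[0].outputs[0].text)
```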

API pricing is set at $1.50 per million input tokens and $7.50 per million output tokens, slotting between low-cost commodity models and premium frontier offerings from OpenAI, Anthropic, and Google.
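At those rates, the economics of an agentic coding workload are easy to check. A quick back-of-envelope calculation, using an illustrative request size:

```python
# Cost check at the published rates:
# $1.50 per million input tokens, $7.50 per million output tokens.
INPUT_RATE = 1.50 / 1_000_000
OUTPUT_RATE = 7.50 / 1_000_000

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at Medium 3.5's list prices."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Illustrative agentic coding request: a 200K-token repo context in,
# a 4K-token patch out.
print(f"${request_cost(200_000, 4_000):.4f}")  # $0.3300
```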

Vibe and the agent layer

The more strategically interesting piece may be Vibe. Rather than the synchronous chat-style coding agents that dominate developer tooling today, Vibe runs sessions asynchronously in cloud sandboxes — multiple agents working in parallel, with developers notified when tasks complete.
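Mistral has not published Vibe's client API in the announcement, but the fan-out-and-notify pattern it describes is straightforward. A generic asyncio sketch of that pattern — `run_agent_in_sandbox` is a hypothetical stand-in, not Vibe's actual interface:

```python
import asyncio

# Generic sketch of the async pattern Vibe describes: fan several agent
# tasks out to cloud sandboxes in parallel, then surface each result as
# it completes. run_agent_in_sandbox is hypothetical, not Vibe's API.
async def run_agent_in_sandbox(task: str) -> str:
    await asyncio.sleep(1)  # placeholder for a long-running cloud session
    return f"done: {task}"

async def main() -> None:
    tasks = [
        asyncio.create_task(run_agent_in_sandbox(t))
        for t in ("fix flaky CI test", "bump deps", "triage Sentry issue")
    ]
    # Developers are notified as each agent finishes, not when all do.
    for finished in asyncio.as_completed(tasks):
        print(await finished)

asyncio.run(main())
```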

Mistral describes a "teleport" feature that lets developers move a local CLI session into the cloud while preserving session history, task state, and approvals. Vibe integrates with GitHub, Linear, Jira, Sentry, Slack, and Teams, putting it into direct competition with Claude Code, OpenAI Codex, Cursor, and a growing roster of agent-native developer platforms.

Why this matters

The release lands during an unusually crowded stretch for model launches. OpenAI shipped GPT-5.5 in late April, Anthropic continues iterating Claude Opus 4.7, Google rolled out Gemini 3.1 Ultra, and Chinese labs including DeepSeek and Tencent have pushed open-source competitors at much lower price points.

Mistral's bet is that a permissively licensed dense model — paired with an async agent runtime — gives regulated industries and sovereignty-conscious European buyers an alternative they can actually deploy on their own infrastructure. Whether enterprises adopt Vibe as the orchestration layer or simply use Medium 3.5 weights inside their existing agent stacks will determine how much of the package sticks.

For developers evaluating the offering today, the calculus is straightforward: a 128B dense model with a 256K context window and a four-GPU self-hosting target is a meaningful new option in the open-weight tier — particularly for teams that have been hesitant to depend solely on closed frontier APIs.

