
Meta Releases SAM 3.1 — Open-Source Image Segmentation Coming to Instagram

Michael Ouroumis · 2 min read

Meta dropped a new version of one of the most useful open-source computer vision models in the field.

SAM 3.1 — the latest iteration of Meta's Segment Anything Model — is now available on Hugging Face and GitHub, with inference code, finetuning code, downloadable checkpoints, and example notebooks. The release also confirmed that SAM 3 technology is coming to Instagram Edits and Vibes in the Meta AI app.

What SAM Does

Segment Anything is exactly what it sounds like. You point at something — with a click, a box, text, or another visual prompt — and SAM precisely outlines and isolates it, in images or video.

This is the kind of technology that makes "remove the background" buttons work well. It's what powers "select the person" in photo editors. SAM is the open-source version that anyone can build with, study, or improve.
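Under the hood, a segmentation mask is just a boolean array the same shape as the image, and "remove the background" amounts to keeping only the pixels the mask marks. A minimal sketch with a hard-coded toy mask (in practice the mask would come from a model like SAM; none of this is Meta's actual pipeline):

```python
import numpy as np

# Toy 4x4 RGB "image" and a binary mask marking the subject.
# The mask is hard-coded purely to show the cutout mechanics;
# a real editor would get it from a segmentation model.
image = np.arange(4 * 4 * 3, dtype=np.uint8).reshape(4, 4, 3)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True  # pretend the model selected the central 2x2 region

# "Remove the background": zero out every pixel outside the mask.
# mask[..., None] broadcasts the (4, 4) mask across the 3 color channels.
cutout = np.where(mask[..., None], image, 0)

print(cutout.shape)     # (4, 4, 3)
print(int(mask.sum()))  # 4 pixels selected
```

The same one-liner scales to any image size; a photo editor's "select the person" tool differs only in where the mask comes from.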

What's New in 3.1

The headline addition in 3.1 is finetuning support. Previous releases included inference code — you could run the model — but 3.1 adds the training infrastructure to adapt it to new domains. Medical imaging, satellite imagery, industrial inspection, autonomous vehicles: any field with specialized segmentation needs can now finetune SAM 3 on domain-specific data.

That matters because "segment anything" works well for general consumer images, but specialized applications need a model that understands their specific objects, lighting conditions, and contexts. Finetuning support turns SAM from a useful tool into a foundation model that specialists can actually build on.
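SAM 3.1's actual training code lives in Meta's repository; as a generic illustration of what segmentation finetuning optimizes, losses such as the soft Dice loss score how well a predicted mask overlaps the ground-truth mask for a domain-specific object. A toy numpy version (the Dice loss is a standard segmentation objective, not something specific to SAM's release):

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss: 1 - 2|P∩T| / (|P| + |T|).

    pred: predicted mask probabilities in [0, 1]
    target: binary ground-truth mask
    """
    intersection = (pred * target).sum()
    return 1.0 - (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# Ground truth: a 4x4 object in the middle of an 8x8 frame.
target = np.zeros((8, 8))
target[2:6, 2:6] = 1.0

perfect = target.copy()          # exact match
uncertain = np.full((8, 8), 0.5) # model hedging everywhere

print(round(float(dice_loss(perfect, target)), 4))    # 0.0
print(round(float(dice_loss(uncertain, target)), 4))  # 0.6667
```

A finetuning loop would backpropagate a loss like this through the model on domain-specific images, which is what adapts "segment anything" into "segment tumors" or "segment rooftops."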

The Consumer Application

Meta confirmed that SAM 3 is headed to Instagram Edits and Vibes — the company's consumer video and photo editing surfaces. For most users, that means better object selection, cleaner cutouts, and more accurate background editing in Instagram's native tools.

It's also a significant distribution play: SAM's capabilities will reach billions of users through apps they're already using, not just developers who build with the API.

Why Open-Source Matters Here

Meta has been consistently open-sourcing its computer vision and AI research. SAM, Llama, and related models have become foundational pieces of the AI ecosystem precisely because they're available to build with, study, and improve.

SAM 3.1's finetuning release extends that pattern. Instead of keeping the best capabilities inside Meta products, the company publishes the tools that let the broader research and developer community push the technology further — and build applications Meta's own teams haven't imagined yet.

