Meta Launches SAM Audio: The First Unified Multimodal Model That Isolates Any Sound from Complex Mixtures with Intuitive Prompts

Meta unveiled SAM Audio on December 16, 2025, a groundbreaking extension of its Segment Anything family into audio that the company claims is the first unified multimodal model for sound separation. It isolates specific sounds such as vocals, instruments, or ambient noise from complex mixtures using text descriptions, visual clicks in videos, or time-span markings, alone or in combination, in a single prompt-driven workflow. Open-sourced in small, base, and large variants alongside benchmarks and a perception encoder, it is now live on the Segment Anything Playground and Hugging Face, lowering barriers for creators and accelerating work in editing, accessibility, and beyond.

Meta Drops SAM 3D: The SAM Evolution That Turns Single Images into Photorealistic 3D Worlds, Crushing Occlusion Nightmares for AR and Robotics

Meta AI unveiled SAM 3D on November 19, 2025, a groundbreaking extension of the Segment Anything Model family that reconstructs full 3D geometry, textures, and poses from a single everyday photo. It comprises two models: SAM 3D Objects, for reconstructing scenes and object meshes even under heavy occlusion and clutter, and SAM 3D Body, for human shape and pose estimation. Trained with a massive human-feedback data engine, it outperforms rivals on real-world benchmarks. With open checkpoints, inference code, and a new evaluation suite now live on GitHub and Hugging Face, SAM 3D cuts 3D capture time from hours to seconds, upending AR/VR, robotics, and VFX pipelines that previously depended on multi-view capture setups.
