Segment Anything Model (SAM), or simply Segment Anything, was developed by Meta AI to precisely segment, or “cut out,” any object in any image with a single click.
SAM performs this task through a promptable segmentation system with remarkable zero-shot generalization: it can recognize and segment objects in images it has never seen before, without any additional training. This makes SAM versatile across many tasks. Users specify what to segment with prompts such as foreground/background points, bounding boxes, or rough masks (the accompanying paper also explores free-form text prompts), or they can have SAM automatically segment everything visible in an image. For ambiguous prompts, the model can return multiple valid masks, reflecting a nuanced understanding of objects and their possible segmentations.
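To make the prompting workflow concrete, here is a minimal sketch using the open-source `segment_anything` Python package that Meta released alongside the model. The checkpoint filename, image path, and click coordinates are placeholders, not values from this post:

```python
import numpy as np
import cv2
from segment_anything import sam_model_registry, SamPredictor

# Load a pretrained SAM checkpoint (ViT-H variant; path is a placeholder).
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

# SAM expects an HxWx3 RGB uint8 array.
image = cv2.cvtColor(cv2.imread("photo.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# Prompt with a single foreground click at pixel (x=500, y=375).
masks, scores, _ = predictor.predict(
    point_coords=np.array([[500, 375]]),
    point_labels=np.array([1]),   # 1 = foreground, 0 = background
    multimask_output=True,        # several candidate masks for an ambiguous click
)
best_mask = masks[np.argmax(scores)]  # (H, W) boolean mask with the top score
```

Passing `box=np.array([x0, y0, x1, y1])` to `predict` prompts with a bounding box instead, and the package’s `SamAutomaticMaskGenerator` implements the segment-everything mode.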
SAM’s power lies in the scale of its training and the flexibility of its design. The model was trained on SA-1B, a dataset of over 11 million images and more than 1.1 billion segmentation masks, built with a model-in-the-loop “data engine”: SAM helped annotate new images, and the resulting masks were used to retrain the model, iteratively improving both the dataset and SAM itself. This scale is what lets SAM form a general notion of what objects are and how to segment them.
The model is split into a heavyweight image encoder, which runs once per image, and a lightweight prompt encoder and mask decoder that can run in a web browser in milliseconds; the sketch below shows why this split matters for interactive use. SAM could significantly impact applications ranging from object tracking in video and photo editing to AR/VR experiences and creative tasks. And, yes, we’re hoping to see it in some real-world action soon.
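The benefit of the split is that the expensive encoder runs once inside `set_image`, after which every `predict` call touches only the small decoder. Continuing the sketch above with a few hypothetical user clicks:

```python
# The heavy ViT image encoder already ran inside predictor.set_image(image);
# each predict() call below runs only the lightweight decoder, so responding
# to a new click on the same image takes milliseconds rather than seconds.
clicks = [(500, 375), (120, 640), (900, 210)]  # hypothetical click positions
for x, y in clicks:
    mask, score, _ = predictor.predict(
        point_coords=np.array([[x, y]]),
        point_labels=np.array([1]),
        multimask_output=False,   # one mask per click
    )
```

This encode-once, decode-many design is what lets the lightweight decoder run in the browser while the heavy image embedding is computed ahead of time.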