Microsoft has released Florence-2, a new vision foundation model that handles a variety of vision tasks, such as captioning, object detection, segmentation, OCR, and region proposal, through a unified, prompt-based representation. The model comes in two sizes, with roughly 230M and 770M parameters (232M and 771M, to be precise). Posts announcing the release variously round these to 200M and 800M, and one cites 540M for the large model. Florence-2 is MIT licensed and is available on Hugging Face. Despite its small size, the 230M model is reported to beat the far larger Flamingo-80B in zero-shot evaluation, and the family is described as offering quality comparable to models 100x larger. Both pre-trained and fine-tuned versions are available.
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks - a novel vision foundation model with a unified, prompt-based representation for a variety of computer vision and vision-language tasks. https://t.co/IlPBw3z9na
Run Florence-2 on Your Local Machine: Florence-2 is a new vision model from Microsoft that excels at all kinds of vision tasks (text recognition, object detection, ...). I ported this Hugging Face Gradio app to run on all machines (Mac, Linux, Windows) and wrote a 1-click launcher. https://t.co/oKpHIDpRsG https://t.co/hHmFbH9A6Q
looks like Florence-2 is really good at OCR… it’s so cool to have models like this under MIT license https://t.co/aXV1rpfjyu
Consume-Florence2 is on the way. Will push to my GitHub in the next few hours. Here is the Hugging Face repo if interested: https://t.co/76osohtjG7 #airesearch
Florence-2 is finally out! 1 model; 10+ computer vision tasks! ↓ key takeaways are listed below. see my blog post for details. link: https://t.co/X03LCsjSOH https://t.co/alYTIKnhYT
Florence-2 is a new vision foundation model by MSFT capable of a wide variety of tasks 🤯 Let's unpack! 🧶 Demo, models and more on the next one 🐣 https://t.co/Frf0blc99M
Wednesday afternoon session of posters #CVPR2024 Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks [Poster #102] TL;DR: Florence-2 is a vision foundation model designed for diverse computer vision and vision-language tasks using a unified,… https://t.co/e8rUQ5BnSM
Microsoft drops Florence-2, a unified model to handle a variety of vision tasks: As of now, both pre-trained and fine-tuned versions of Florence-2 232M and 771M are available on Hugging Face under a permissive MIT license. https://t.co/Kxc33nvTmu #AI #Business
Microsoft drops Florence-2, a unified model to handle a variety of vision tasks https://t.co/W8VxEcN1xe
Microsoft Florence-2 looks like a game changer for a lot of vision tasks 🤯 (Great accuracy + insanely fast) Try it here: https://t.co/ILCxqfsWLN https://t.co/62FqQDOmiQ
🔥Microsoft drops Florence-2: Tiny Vision foundation models that slay!! 🚀 🤯230M & 770M Models (Base + FT). 230M beats Flamingo-80B in Zero-Shot! https://t.co/WKo0DsKkgF
New vision model from @Microsoft, Florence-2 - Can perform various tasks: object detection, grounding, segmentation, OCR - 200M and 800M models https://t.co/Wp8LMjkM0M
🚀 Microsoft just released Florence-2 large 🤯, a 540M parameter vision-language behemoth! 🌟 - MIT Licensed ✅ - Powerful: Outperforms Flamingo 80B (~ 200x larger model) in multiple vision…
Smol VLMs ftw! Microsoft just dropped Florence - SOTA 200M & 800M parameter vision foundation model! 🔥 > Best part MIT Licensed! 🤯 > 200M checkpoint beats Flamingo 80B (400x bigger model) by a huge margin > Performs captioning, object detection and segmentation, OCR, phrase… https://t.co/X2xIdbVCrV
Woah what??? Microsoft just dropped Florence-2 on @huggingface with an MIT license!! Pretty huge. Florence was initially @Microsoft’s internal CLIP model, and they now expanded it to do various tasks like captioning, object detection, OCR, … just by prompting the model https://t.co/xfCQ666j8v
Microsoft just silently dropped Florence 👀Vision model that can tackle many vision tasks (captioning, detection, region proposal, OCR) 🤏Small models (200M and 800M) with ~quality to models 100x larger 🔥MIT licensed Paper and models: https://t.co/FlRzmdtAj3