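Alibaba has announced Qwen-VL, which is reported to outperform GPT-4V and Gemini on several benchmarks, although one tester who rates it above every other local vision model they have tried considers the GPT-4V comparison exaggerated, since GPT-4V reasons better on complex questions about images. New versions called Qwen-VL-Plus and Qwen-VL-Max have been released with significant improvements to visual reasoning, text recognition, and other capabilities. According to one tweet, Qwen-VL-Max has been released as open source on Hugging Face and GitHub. The release has been well received, and the code, models, and demo are promised to be made public soon. Additionally, LLaVA-1.6 has been released: it exceeds Gemini Pro on several benchmarks, supports higher-resolution inputs (up to 4x more pixels than LLaVA-1.5), and maintains LLaVA-1.5's data efficiency, with LLaVA-1.6-34B trained in about one day on 32 A100s.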
LLaVA 1.6 is out! 🥳
- Outperforms Gemini PRO on some benchmarks
- Higher resolution than LLaVA 1.5 (up to 4x more pixels!)
- Better OCR capability and instruction-following
- More conversational
Models: https://t.co/200Qffi6fM
Blog: https://t.co/nh5TaTHH3W
https://t.co/kYrE7O2V1O
🚀We are thrilled to release LLaVA-1.6, with improved reasoning, OCR, and world knowledge. It supports higher-res inputs, more tasks, and exceeds Gemini Pro on several benchmarks! 🤯 It maintains the data efficiency of LLaVA-1.5, and LLaVA-1.6-34B is trained ~1 day with 32 A100s.… https://t.co/nGRpLX8FQv
Thanks to @_akhaliq for sharing. The code, models and demo will be made public soon! 🤗 Check out our progress at https://t.co/Vvm7B76zAd. https://t.co/8ZS4dy1lKf
Qwen-VL is pretty good https://t.co/W8YFF1Dh5V
Qwen-VL definitely outperforms every other local vision model I've tested. Claims of outperforming GPT-4V seem exaggerated. GPT-4V has GPT's better ability to reason, so asking more complex questions about pictures there's almost no contest. Will post comparisons later https://t.co/kBgRR3JNiU
🇨🇳 AliBaba matches GPT-4V vision model
TLDR: Release Qwen-VL-Max open source on HuggingFace and Github
> User: Dude, where's my (red) car?
> Qwen: Let me SHOW you
> User: Do my homework problem
> Qwen: (it solved for surface area and volume, step by step)
> User: Explain how… https://t.co/rjtDV8UnPi
Thanks @_akhaliq for sharing our latest Qwen-VL. Preview: The next release is coming soon! 🚀 https://t.co/Gwi2WiKIYD
New versions called Qwen-VL-Plus and Qwen-VL-Max have been released with significant improvements to visual reasoning, text recognition, and other capabilities. Demo link: https://t.co/rBiz0lNony "Compared to the open-source version of Qwen-VL, these two models perform on par… https://t.co/Xvwx5GlG1V
Alibaba announces Qwen-VL demo: https://t.co/hTv2ZOURtJ blog: https://t.co/UIgrfXbIyF Qwen-VL outperforms GPT-4V and Gemini on several benchmarks. https://t.co/lYi5QY22as