Jina AI and Nomic AI have released new state-of-the-art multimodal embedding models that outperform OpenAI CLIP in text-image retrieval. Jina AI's Jina CLIP v1 ships with ONNX weights, making it compatible with Transformers.js v3 and able to run with WebGPU acceleration. Nomic AI's Nomic Embed Vision aligns image embeddings with the existing Nomic Embed Text latent space, giving a unified, high-quality embedding space for image, text, and multimodal tasks; it outperforms OpenAI CLIP, Jina CLIP, and text-embedding-3-small, and the text side retains its 8k context length. Nomic's embeddings have also been used to build a semantic search tool over The Met's collection of 250,000 artworks, described by its creators as the first semantic search tool of its kind at this scale, with existing Nomic text embeddings stored in vector databases like MongoDB and Weaviate now usable directly for multimodal search.
Using the neural search feature, I was able to find a sculpture of a woman holding flowers, in less than a second! This might take an intern at a museum quite a while, sifting through archives of hundreds of thousands of pieces! You can't currently search semantically https://t.co/9qC6vn9cFd https://t.co/tOMtyewKhf
Super excited to finally share this: to our knowledge this is the largest ever tool to search *semantically* over this large of a collection! TLDR: We turned the @metmuseum into a vector db.... it's pretty crazy what you can do with this! https://t.co/tOMtyewKhf
We embedded 250,000 works of art from The Met using @nomic_ai's new SOTA #multimodal embeddings model! It's the *first ever* semantic search tool of its kind. Search with smart queries like "oil painting with flowers & dogs". How we did it & how to use it: https://t.co/sWjW78zUtI
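To make the "turned The Met into a vector db" idea concrete, here is a minimal sketch of the retrieval step: image embeddings are computed once, L2-normalized, and stored; a text query is embedded into the same space and matched by cosine similarity. The top_k_artworks helper and the random placeholder embeddings below are hypothetical illustrations, not the team's actual pipeline.

```python
# Minimal sketch of semantic search over precomputed image embeddings.
# In practice the vectors come from a multimodal model (e.g. Nomic
# Embed Vision) and live in a vector DB; here plain NumPy stands in.
import numpy as np

def top_k_artworks(query_emb: np.ndarray,
                   image_embs: np.ndarray,
                   k: int = 5) -> np.ndarray:
    """Indices of the k nearest images. Assumes both inputs are
    L2-normalized, so the dot product equals cosine similarity."""
    scores = image_embs @ query_emb          # (n_images,)
    return np.argsort(-scores)[:k]

# Hypothetical stand-in data: 250k artworks, 768-dim embeddings.
rng = np.random.default_rng(0)
image_embs = rng.normal(size=(250_000, 768)).astype(np.float32)
image_embs /= np.linalg.norm(image_embs, axis=1, keepdims=True)

query_emb = rng.normal(size=768).astype(np.float32)  # would come from the text encoder
query_emb /= np.linalg.norm(query_emb)

print(top_k_artworks(query_emb, image_embs))
```

A brute-force scan like this is already sub-second at 250k vectors; vector databases such as MongoDB Atlas or Weaviate add approximate-nearest-neighbor indexes so the same query scales further.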
Announcing Nomic Embed Vision - All Nomic Text Embeddings are now multimodal in the v1 and v1.5 latent space. - 8k context length, outperforms OpenAI CLIP and @JinaAI_ CLIP - Open data, model weights and training code https://t.co/1SgD1dDuwX
All your nomic embed text embeddings sitting in vector DBs like @MongoDB and @weaviate_io are now multimodal! Use them to search over any image dataset embedded with Nomic Embed Vision! Best of all? Nomic Embed Vision outperforms OpenAI CLIP! https://t.co/8ng0Q6YtLA
Today, every Nomic-Embed-Text embedding becomes multimodal. Introducing Nomic-Embed-Vision: - a high quality, unified embedding space for image, text, and multimodal tasks - outperforms both OpenAI CLIP and text-embedding-3-small - open weights and code to enable indie… https://t.co/uYH97GiwtV
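As a rough illustration of the workflow these announcements describe, the sketch below embeds a text query with Nomic Embed Text and an image with Nomic Embed Vision, then compares them in the shared latent space. The model IDs, the CLS-token pooling, and the "search_query: " prefix follow the Hugging Face model cards at release; treat them as assumptions that may drift rather than a stable API.

```python
# Sketch: cross-modal similarity in the shared Nomic latent space.
import torch
import torch.nn.functional as F
from PIL import Image
from sentence_transformers import SentenceTransformer
from transformers import AutoImageProcessor, AutoModel

# Text side: queries are prefixed per the model card's task scheme.
text_model = SentenceTransformer("nomic-ai/nomic-embed-text-v1.5",
                                 trust_remote_code=True)
text_emb = text_model.encode(
    ["search_query: oil painting with flowers and dogs"],
    convert_to_tensor=True, normalize_embeddings=True)

# Vision side: CLS-token embedding, L2-normalized (per the card).
processor = AutoImageProcessor.from_pretrained("nomic-ai/nomic-embed-vision-v1.5")
vision_model = AutoModel.from_pretrained("nomic-ai/nomic-embed-vision-v1.5",
                                         trust_remote_code=True)
image = Image.open("artwork.jpg")  # hypothetical local file
with torch.no_grad():
    out = vision_model(**processor(image, return_tensors="pt"))
img_emb = F.normalize(out.last_hidden_state[:, 0], p=2, dim=1)

# Cosine similarity between the query and the image.
print((text_emb @ img_emb.T).item())
```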
Jina CLIP v1 just released: a new state-of-the-art multimodal embedding model that outperforms OpenAI CLIP in text-image retrieval! We also contributed ONNX weights so it's now compatible with 🤗 Transformers.js v3 and runs with WebGPU acceleration! Try out the demo! https://t.co/3XAs6j5qTC
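The linked demo runs in the browser through Transformers.js v3, but the model card also documents a Python path via remote code; a minimal sketch is below. The encode_text and encode_image helpers (and the assumption that they return normalized vectors) come from that card's example and are not part of core transformers.

```python
# Sketch: text-image similarity with Jina CLIP v1, following the
# model card's remote-code example. encode_text / encode_image are
# custom methods loaded with trust_remote_code=True.
import numpy as np
from transformers import AutoModel

model = AutoModel.from_pretrained("jinaai/jina-clip-v1",
                                  trust_remote_code=True)

text_embs = model.encode_text(["a sculpture of a woman holding flowers"])
image_embs = model.encode_image(["artwork.jpg"])  # path or URL, per the card

# Assuming normalized outputs, dot product = cosine similarity.
print(float(np.dot(text_embs[0], image_embs[0])))
```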