Nomic AI has introduced Nomic Embed (nomic-embed-text-v1), a new open source text embedding model that outperforms OpenAI's text-embedding-3-small and Ada models and Jina's jina-embeddings-v2 on both short and long context benchmarks. The model is fully open, with open source code, open data, and open weights, and is released under the Apache 2 License. It shipped with day-one integrations for LangChain, LlamaIndex, and MongoDB. The Nomic AI team has also released the 235 million text pairs used for training, along with the full training recipe, so that others can train their own state-of-the-art (SOTA) text embedding models.
nomic-embed-text-v1 (@nomic_ai) is the latest SOTA long-context embedding model. A cool feature is that it not only outperforms @OpenAI ada and jina-embeddings-v2 but the authors have also released the full training recipe so that anyone can build a SOTA embedding model from… https://t.co/P6Svkv6NbH https://t.co/zjrO9MGt3v
What if we told you there was an embedding that beats OpenAI's brand new text-embedding-3-small, and it's open source? Not only open source, but open data and open training code? Introducing @nomic_ai's nomic-embed-text-1! And it's integrated with LlamaIndex on day 0! How to use… https://t.co/W0MxDdqaDf
Nomic Embed is a new open source embedding model that beats even @OpenAI's new text-embedding-3. The @nomic_ai team is too epic. 💪 https://t.co/jbkgQ0BBAO
Congrats to @nomic_ai for launching the SOTA embedding model! And it is truly open source - open data, open weights, and open code. https://t.co/QrMimJaiGb
Announcing Nomic Embed 🧨 You can now train your own OpenAI quality text embedding model. - Open source, fully reproducible text embedding model that beats OpenAI and Jina on long context tasks. - 235M text pairs openly released for training 💰 - Apache 2 License https://t.co/YKaEBNKmVE
Embeddings are at the core of our business model at Nomic AI. That's why we took the time and effort to train the best long-context text embedding model there is, to integrate across our system, and to open source everything about it--code, data, and weights. https://t.co/XiZn1QbnED
Introducing Nomic Embed - the first fully open long context text embedder to beat OpenAI - Open source, open weights, open data - Beats OpenAI text-embedding-3-small and Ada on short and long context benchmarks - Day 1 integrations with @langchain, @llama-index, @MongoDB https://t.co/miDGh2OVhv