Google has unveiled Gecko, a compact text embedding model. With 768 embedding dimensions, it competes with models seven times its size and with models whose embeddings have five times as many dimensions. Developed at Google DeepMind by Jinhyuk Lee, Z Dai, X Ren, B Chen, and others, Gecko distills knowledge from large language models (LLMs) into a compact retriever: it is trained on LLM-generated synthetic data, which is then refined with LLM-based relabeling. The model is reported to deliver strong retrieval performance and versatility across tasks. On the Massive Text Embedding Benchmark (MTEB), it is described as the strongest model that fits under 768 dimensions, and it is available on Google Cloud for applications such as RAG, retrieval, and vector databases.
Great new work from our team and colleagues at @GoogleDeepMind! On the Massive Text Embedding Benchmark (MTEB), Gecko is the strongest model to fit under 768-dim. Try it on @googlecloud. Use it for RAG, retrieval, vector databases, etc. https://t.co/dOZjKX1pcT
Gecko: Versatile Text Embeddings Distilled from Large Language Models. Proposes a compact yet high-performing text embedding model created by distilling knowledge from LLMs into a retriever through synthetic data & refining it with LLM-based relabeling. https://t.co/bxq0r4eTUG https://t.co/jvBSahnrX7
Gecko, a compact text embedding model by Jinhyuk Lee et al., outpaces larger models by distilling LLM knowledge through innovative synthetic data refinement: https://t.co/1mHD4jwbA1 https://t.co/f7t8iICAHi
[CL] Gecko: Versatile Text Embeddings Distilled from Large Language Models. J Lee, Z Dai, X Ren, B Chen… [Google DeepMind] (2024) https://t.co/Vj4x9Rz7Co - The paper presents Gecko, a compact yet versatile text embedding model powered by distilling knowledge from large language… https://t.co/DJnneGxaaa
Google announces Gecko: Versatile Text Embeddings Distilled from Large Language Models. We present Gecko, a compact and versatile text embedding model. Gecko achieves strong retrieval performance by leveraging a key idea: distilling knowledge from large language models (LLMs). https://t.co/eftqOpkkqa
Google presents Gecko: Versatile Text Embeddings Distilled from Large Language Models. Gecko with 768 emb dim competes with 7x larger models and 5x higher dimensional embeddings. https://t.co/R9sQ8hpTGb https://t.co/wbjfsiGy4r