The lmsys team has introduced RouteLLM, a new open-source routing framework designed to reduce the costs of using large language models (LLMs) like GPT-4. By directing simpler queries to cheaper models, RouteLLM claims to achieve cost reductions of over 85% on MT Bench and 45% on MMLU while maintaining 95% of the quality. This initiative is reportedly 40% cheaper than existing routers such as Martian. Developed in collaboration with Anyscale, RouteLLM uses human preference data to intelligently select the best model for each query, ensuring cost-effectiveness without compromising performance.
RouteLLM sounds very promising. Small language models (SLMs) are often sufficient, especially for everyday use, yet the large models get used anyway. So it is welcome that a router picks the best model with inference cost and latency in mind. https://t.co/3hTd7yokAY
RouteLLM: An Open-Source Framework for Cost-Effective LLM Routing https://t.co/eMCar1PCBA
A very useful pricing sheet from @_philschmid for top LLMs! Look, Gemini 1.5 Pro is 2 times cheaper than GPT-4o and Claude 3.5 Sonnet. And LLMs like Llama 3 70b and Deepseek v2 models offer not only high-quality performance but also cost-effectiveness👇 https://t.co/lt9NpcUvt1 https://t.co/H6JZVQQAap
1/ 🚀 Introducing RouteLLM: a routing framework based on human preference data for routing queries between powerful proprietary LLMs and cost-effective LLMs, developed in collaboration with @lmsysorg . By intelligently selecting the best model for each query, our router models… https://t.co/TUpm1PJPpX
Automatically route LLM calls to the best LLM. Very cool idea, and they have the perfect data for it https://t.co/TEGkeBxSGS
The lmsys team releases a router! Given that some queries don't really need the full bulk of a big model, they have released a framework that will route your queries to different models (and reduce cost)! Super smart, will be testing this very soon! 👀 https://t.co/AheRBEN5nq
Amazing initiative from lmsys team! The llm router competition heats up. Claims to be up to 45-85% cheaper than GPT-4-only depending on the type of questions routed (MT-bench vs MMLU), while preserving 95% of quality. Reportedly 40% cheaper than existing routers (eg Martian). https://t.co/P1FaPlZsVE
Not all questions need GPT-4! We introduce RouteLLM – a routing framework based on human preference data that directs simple queries to a cheaper model. With data augmentation techniques, RouteLLM achieves cost reductions of over 85% on MT Bench and 45% on MMLU while… https://t.co/hXmuO1FfW2
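The routing idea these posts describe can be sketched as a simple threshold rule: score how hard a query looks, then send easy queries to a cheap model and hard ones to a strong one. This is a minimal, hypothetical illustration only; the `difficulty_score` heuristic, the threshold, and the model names are assumptions for the sketch, while RouteLLM's actual routers are learned from human preference data rather than keyword rules.

```python
def difficulty_score(query: str) -> float:
    """Toy stand-in for a learned router. Returns a value in [0, 1];
    higher means the query likely needs a stronger model.
    (Illustrative heuristic, not RouteLLM's method.)"""
    hard_markers = ("prove", "derive", "analyze", "step by step", "why")
    hits = sum(marker in query.lower() for marker in hard_markers)
    return min(1.0, 0.2 * len(query.split()) / 50 + 0.4 * hits)

def route(query: str, threshold: float = 0.5) -> str:
    """Pick a model based on the score. Lowering the threshold shifts
    traffic toward the strong (expensive) model; raising it cuts cost
    at some quality risk."""
    strong, weak = "gpt-4", "mixtral-8x7b"  # illustrative model names
    return strong if difficulty_score(query) >= threshold else weak

print(route("What is 2 + 2?"))  # → mixtral-8x7b
print(route("Prove step by step why the halting problem is undecidable."))  # → gpt-4
```

The reported 85% (MT Bench) vs. 45% (MMLU) cost reductions reflect exactly this threshold trade-off: the more of a benchmark's queries a router can safely send to the weak model, the larger the saving at the same quality target.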