Groq Inc.'s LPU inference engine has been clocked at over 1,000 tokens per second (T/s) running Meta's Llama-3-70b. Part of that speed is attributed to Llama 3's introduction of grouped query attention (GQA) across its models, which improves inference efficiency and overall performance. The Llama-3 model is also noted for its open-source nature, allowing widespread customization and use. Separately, a quantized build (Meta-Llama-3-70B-Instruct-64k-i1-GGUF-IQ2_S) has been reported running at 42K on a system with an Intel 12700K processor, an RTX 3090 GPU, and 32GB of RAM under Windows 11 Pro.
ChatLLM Teams - One AI Assistant To Rule Them All Compare all the SOTA LLMs in one place! Check out Llama-3 speed... 🤯🤯 Support open-source and use powerful models at the same time - All at $10 / month https://t.co/n1N2DUENp5 https://t.co/eOwhGWnioH
Performance Boosts with Intel's P-Cores: Optimizing llama.cpp-based Programs for an Enhanced LLM Inference Experience! "running Meta-Llama-3-70B-Instruct-64k-i1-GGUF-IQ2_S at 42K on a system with Windows 11 Pro, Intel 12700K processor, RTX 3090 GPU, and 32GB of RAM. By changing the… https://t.co/ZsJtiNBLge
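The tweet above is truncated, so the exact change it describes is not shown, but the general idea of restricting a llama.cpp-based program to the 12700K's performance cores can be sketched. On Windows this is typically done via Task Manager or `start /affinity`; the sketch below uses Linux's `os.sched_setaffinity` instead, and the P-core CPU numbering is an assumption, not something stated in the tweet.

```python
import os

# Assumed topology for an Intel 12700K on Linux: logical CPUs 0-15 are the
# 8 hyperthreaded P-cores, 16-19 the 4 E-cores. Verify with `lscpu` or
# /proc/cpuinfo on your own machine before relying on this mapping.
P_CORE_CPUS = set(range(16))

def pin_to_p_cores(pid=0):
    """Restrict a process (pid=0 means the calling process) to P-core CPUs."""
    # Intersect with the CPUs we are currently allowed to use, so this
    # also works inside containers or on machines with fewer cores.
    allowed = P_CORE_CPUS & os.sched_getaffinity(pid)
    if not allowed:
        raise RuntimeError("no P-core CPUs available in the current affinity mask")
    os.sched_setaffinity(pid, allowed)
    return os.sched_getaffinity(pid)
```

Pinning inference threads to homogeneous P-cores avoids the scheduler bouncing hot compute threads onto slower E-cores mid-generation, which is one plausible reading of the optimization the tweet alludes to.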
LLaMA3 4 weeks later and a reality check
- Llama 3 has introduced grouped query attention (GQA) across its models, improving inference efficiency and model performance.
- Democratization of AI: The "open-source" nature of Llama 3 allows for widespread use and customization,…
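The GQA mechanism mentioned above can be illustrated with a minimal sketch: several query heads share each key/value head, shrinking the KV cache and speeding up inference while keeping full query resolution. The shapes and head counts below are illustrative, not Llama 3's actual configuration.

```python
import numpy as np

def gqa(q, k, v):
    """Grouped query attention: n_q_heads query heads share n_kv_heads KV heads.

    q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d), with n_kv_heads
    dividing n_q_heads. With n_kv_heads == n_q_heads this reduces to
    standard multi-head attention.
    """
    n_q_heads, seq, d = q.shape
    n_kv_heads = k.shape[0]
    group = n_q_heads // n_kv_heads  # query heads per shared KV head
    out = np.empty_like(q)
    for h in range(n_q_heads):
        kv = h // group  # consecutive query heads map to the same KV head
        scores = q[h] @ k[kv].T / np.sqrt(d)
        scores -= scores.max(axis=-1, keepdims=True)  # stable softmax
        w = np.exp(scores)
        w /= w.sum(axis=-1, keepdims=True)
        out[h] = w @ v[kv]
    return out
```

Because only `n_kv_heads` K/V tensors are cached per layer instead of `n_q_heads`, the KV cache shrinks by the grouping factor, which is where the inference-efficiency gain comes from.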
Groq Llama3 & Jan ❤️ "The response generation is so fast that I can't even keep up with it," wrote @1abidaliawan at @kdnuggets. Big shoutout to @GroqInc, @metaai's Llama3. ⚡️ https://t.co/M4QIaSsepi
WTF, @GroqInc . . . how is your #LPU inference engine speeding up over time; blazing 1,000+ T/s with @aiatmeta's colossal Llama-3-70b? This must be the fastest LLM for any stack, @sundeep @bensima. https://t.co/uzEgT5ttRj