Recent weeks have brought a wave of new Llama 3 models and fine-tunes. Notable releases include Hermes 2 Pro Llama 3, Llama 3 OpenBio LLM for the medical domain, and Nvidia's Llama3-ChatQA-1.5 series. Several fine-tuned versions now score higher than Llama-3-70B and hold the top MMLU/GSM8K results on the Hugging Face Open LLM Leaderboard. Groq Inc. continues to set the bar for serving throughput (tokens/s), while community teams have released Llama-3 variants with context windows extended as far as 1M tokens that achieve perfect retrieval scores on the needle-in-a-haystack (NIAH) test.
Nice - Llama-3-70B with a 1048k context length. Trained on 34M tokens for this stage, and ~430M tokens total across all stages, which is < 0.003% of Llama-3's original pre-training data. https://t.co/8PQjIF5UT0
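The "< 0.003%" figure in the tweet above is easy to sanity-check, assuming Meta's stated pre-training budget of 15T+ tokens for Llama 3 (the exact denominator is an assumption, not stated in the tweet):

```python
# Sanity check of the claim that ~430M long-context training tokens
# are < 0.003% of Llama-3's pre-training data.
# Assumes Meta's reported figure of 15T+ pre-training tokens.

long_context_tokens = 430e6   # ~430M tokens across all extension stages
pretraining_tokens = 15e12    # assumed: Meta reports 15T+ tokens for Llama 3

fraction = long_context_tokens / pretraining_tokens
print(f"{fraction:.5%}")  # ≈ 0.00287%, consistent with "< 0.003%"
```

The point of the tweet stands either way: context extension touches a vanishingly small fraction of the data the base model saw.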
We’re going back 2 back! 🔥 Introducing the first 1M context window @AIatMeta Llama-3 70B to pair with our Llama-3 8B model that we launched last week on @huggingface. Our 1M context window 70B model landed a perfect score on NIAH and we’re excited about the results that… https://t.co/d8g8hEKm5r
Nvidia's Llama 3 Chat QA 1.5 models are quite 🔥 > Finetuned Llama 3 8B and 70B > 8B beats Command R Plus on ChatRAG bench (in picture) ⚡ > Fine-tuned specifically for Chat and RAG use-cases > Builds on ChatQA 1.0 recipe by adding more tabular, arithmetic and QA data > Release… https://t.co/qXY4qDZa0J
AutoTrain finetuned llama3-70b is now one of the top models on the Open LLM Leaderboard 🚀 This model used PEFT and no quantization. A single 8xH100 was used to train this model and it took ~2.5 hours. No code was written to train this model 🤪 1/2 https://t.co/UL9TgZ50SL
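A rough back-of-envelope shows why a PEFT (LoRA-style) fine-tune of a 70B model fits on a single 8xH100 node in a few hours: only a tiny fraction of the weights are trainable. The rank and target modules below are assumptions for illustration; the tweet does not state the actual AutoTrain configuration. The shape constants are Llama-3-70B's published architecture (hidden size 8192, 80 layers, grouped-query attention with 8 KV heads of head dim 128):

```python
# Back-of-envelope: trainable-parameter fraction for a LoRA fine-tune
# of Llama-3-70B. Rank and target modules are illustrative assumptions.

hidden = 8192   # Llama-3-70B hidden size
kv_dim = 1024   # 8 KV heads * head_dim 128 (grouped-query attention)
layers = 80
rank = 16       # assumed LoRA rank

def lora_params(d_in: int, d_out: int, r: int) -> int:
    # A frozen weight W (d_out x d_in) gains two small adapters,
    # A (r x d_in) and B (d_out x r): r * (d_in + d_out) new params.
    return r * (d_in + d_out)

per_layer = (
    lora_params(hidden, hidden, rank)    # q_proj
    + lora_params(hidden, kv_dim, rank)  # k_proj
    + lora_params(hidden, kv_dim, rank)  # v_proj
    + lora_params(hidden, hidden, rank)  # o_proj
)
trainable = per_layer * layers
print(f"trainable: {trainable / 1e6:.1f}M ({trainable / 70e9:.4%} of 70B)")
```

Under these assumptions roughly 65M of 70B parameters train, well under 0.1% — optimizer state for the adapters is negligible next to the frozen base weights.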
It's Friday night and our kitchen’s still open! 🍳 Now serving a 524K context window @AIatMeta Llama-3 70b on @huggingface, which has been trained on a more extensive proprietary chat dataset to give the model chat ability over long sequences! The best part is that we’re still… https://t.co/YrLzSjZxxt
Introducing ChatQA-1.5, a family of models (Llama3-ChatQA-1.5-8B and Llama3-ChatQA-1.5-70B) that excel at conversational QA and RAG. We also open source our instruction tuning data, ChatRAG Bench for evaluation, and a multi-turn QA retriever. Link: https://t.co/2uxHQnfzZB https://t.co/gR9AOiHZLJ
Introducing ChatQA-1.5, a family of models that surpasses GPT-4-0613 and Command-R-Plus on RAG and conversational QA. ChatQA-1.5 has two variants: Llama3-ChatQA-1.5-8B (https://t.co/H7JvIFCD48) and Llama3-ChatQA-1.5-70B (https://t.co/Ao3Yw8ECxA). We also open source our instruction…
Don't worry, we didn't forget about 70b🥳 Take a look at the first @AIatMeta Llama-3 70b model with a 262K context length - scoring perfect retrieval on NIAH! We included an extensive proprietary chat dataset to give the model chat ability over long sequences as well. A… https://t.co/toYrUUTRJH
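Several of the tweets above cite perfect NIAH (needle-in-a-haystack) retrieval. A minimal sketch of how such a test case is built: a "needle" fact is buried at a chosen depth inside filler text and the model is scored on recovering it. Real harnesses differ in needle wording, depth/length sweeps, and grading (some use an LLM judge); this only illustrates the construction:

```python
# Minimal sketch of a needle-in-a-haystack (NIAH) test case.
# Real evaluation harnesses vary; this shows the basic idea only.

def build_niah_prompt(filler: str, needle: str, depth: float, length: int) -> str:
    """Embed `needle` at relative `depth` (0.0 = start, 1.0 = end)
    inside `length` repetitions of `filler`."""
    haystack = [filler] * length
    haystack.insert(int(depth * length), needle)
    return " ".join(haystack) + "\nWhat is the magic number mentioned above?"

def score(model_answer: str, needle_value: str) -> bool:
    # Simplest grading: the needle's value appears in the answer.
    return needle_value in model_answer

prompt = build_niah_prompt(
    filler="The grass is green and the sky is blue.",
    needle="The magic number is 42.",
    depth=0.5,
    length=1000,
)
print(score("I believe the magic number is 42.", "42"))  # True
```

A "perfect score" means the model retrieves the needle at every tested (depth, context-length) combination up to the advertised window.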
We're excited to see @ArtificialAnlys's newly launched leaderboard on @huggingface, with @GroqInc continuing to set the bar for throughput (tokens/s). Groq's throughput for Llama 3 70B exceeds what the vast majority of providers can deliver for Llama 3 8B. https://t.co/wa5gqnEBfe
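The throughput metric behind that comparison is simply generated tokens divided by wall-clock generation time. A minimal sketch of the measurement, with `generate()` as a stand-in for a real model call (the actual leaderboard methodology also accounts for time-to-first-token and request batching):

```python
import time

# Minimal sketch of the tokens/s throughput metric:
# generated tokens divided by wall-clock generation time.

def generate(prompt: str) -> list[str]:
    # Placeholder for a real model call: pretend 256 tokens come back.
    return ["tok"] * 256

start = time.perf_counter()
tokens = generate("Explain grouped-query attention.")
elapsed = time.perf_counter() - start

throughput = len(tokens) / elapsed
print(f"{len(tokens)} tokens in {elapsed:.4f}s -> {throughput:.0f} tokens/s")
```

With a real endpoint, `elapsed` would be measured around the streaming response, and 70B throughput exceeding most providers' 8B numbers is what makes the Groq result notable.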
🚀 The first fine-tuned models to score higher than Llama-3-70B & achieve the best MMLU/GSM8K at the same time! - 3 of the top 10 spots on the Open LLM Leaderboard are now held by these fine-tuned models - Achieved the highest MMLU / GSM8K on the @huggingface Leaderboard https://t.co/UeVaHPOe6t
Some awesome new models in the 🤗 MLX community: - Hermes 2 Pro Llama 3 by @NousResearch - Llama 3 OpenBio LLM for medical domain by @aadityaura - Llama3-ChatQA-1.5-8B by @nvidia All here: https://t.co/dUgErUXnM3 h/t @vkash16, @lucataco93, @Prince_Canuma for conversions!
Meta’s Llama3 AI: ChatGPT Intelligence… For Free! https://t.co/P4BTfuNjxP