Recent weeks have brought a wave of new Llama 3 models and fine-tunes. Notable releases include Hermes 2 Pro Llama 3, Llama 3 OpenBio LLM for the medical domain, and Nvidia's Llama3-ChatQA-1.5 series. Several fine-tuned versions now score higher than Llama-3-70B and hold the top MMLU/GSM8K results on the Hugging Face Open LLM Leaderboard. Groq Inc. continues to set the bar for serving throughput (tokens/s), while community teams have released Llama-3 variants with context windows extended as far as 1M tokens that achieve perfect retrieval scores on the needle-in-a-haystack (NIAH) test.
Nice - Llama-3-70B with a 1048k context length. Trained on 34M tokens for this stage, and ~430M tokens total across all stages, which is < 0.003% of Llama-3's original pre-training data. https://t.co/8PQjIF5UT0
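The "< 0.003%" figure in the tweet above is easy to sanity-check, assuming Meta's stated pre-training budget of 15T+ tokens for Llama 3 (the exact denominator is an assumption, not stated in the tweet):

```python
# Sanity check of the claim that ~430M long-context training tokens
# are < 0.003% of Llama-3's pre-training data.
# Assumes Meta's reported figure of 15T+ pre-training tokens.

long_context_tokens = 430e6   # ~430M tokens across all extension stages
pretraining_tokens = 15e12    # assumed: Meta reports 15T+ tokens for Llama 3

fraction = long_context_tokens / pretraining_tokens
print(f"{fraction:.5%}")  # ≈ 0.00287%, consistent with "< 0.003%"
```

The point of the tweet stands either way: context extension touches a vanishingly small fraction of the data the base model saw.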
We’re going back 2 back! 🔥 Introducing the first 1M context window @AIatMeta Llama-3 70B to pair with our Llama-3 8B model that we launched last week on @huggingface. Our 1M context window 70B model landed a perfect score on NIAH and we’re excited about the results that… https://t.co/d8g8hEKm5r
Nvidia's Llama 3 Chat QA 1.5 models are quite 🔥 > Finetuned Llama 3 8B and 70B > 8B beats Command R Plus on ChatRAG bench (in picture) ⚡ > Fine-tuned specifically for Chat and RAG use-cases > Builds on ChatQA 1.0 recipe by adding more tabular, arithmetic and QA data > Release… https://t.co/qXY4qDZa0J
AutoTrain finetuned llama3-70b is now one of the top models on the Open LLM Leaderboard 🚀 This model used PEFT and no quantization. A single 8xH100 was used to train this model and it took ~2.5 hours. No code was written to train this model 🤪 1/2 https://t.co/UL9TgZ50SL
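A rough back-of-envelope shows why a PEFT (LoRA-style) fine-tune of a 70B model fits on a single 8xH100 node in a few hours: only a tiny fraction of the weights are trainable. The rank and target modules below are assumptions for illustration; the tweet does not state the actual AutoTrain configuration. The shape constants are Llama-3-70B's published architecture (hidden size 8192, 80 layers, grouped-query attention with 8 KV heads of head dim 128):

```python
# Back-of-envelope: trainable-parameter fraction for a LoRA fine-tune
# of Llama-3-70B. Rank and target modules are illustrative assumptions.

hidden = 8192   # Llama-3-70B hidden size
kv_dim = 1024   # 8 KV heads * head_dim 128 (grouped-query attention)
layers = 80
rank = 16       # assumed LoRA rank

def lora_params(d_in: int, d_out: int, r: int) -> int:
    # A frozen weight W (d_out x d_in) gains two small adapters,
    # A (r x d_in) and B (d_out x r): r * (d_in + d_out) new params.
    return r * (d_in + d_out)

per_layer = (
    lora_params(hidden, hidden, rank)    # q_proj
    + lora_params(hidden, kv_dim, rank)  # k_proj
    + lora_params(hidden, kv_dim, rank)  # v_proj
    + lora_params(hidden, hidden, rank)  # o_proj
)
trainable = per_layer * layers
print(f"trainable: {trainable / 1e6:.1f}M ({trainable / 70e9:.4%} of 70B)")
```

Under these assumptions roughly 65M of 70B parameters train, well under 0.1% — optimizer state for the adapters is negligible next to the frozen base weights.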
It's Friday night and our kitchen’s still open! 🍳 Now serving a 524K context window @AIatMeta Llama-3 70b on @huggingface, which has been trained on a more extensive proprietary chat dataset to give the model chat ability over long sequences! The best part is that we’re still… https://t.co/YrLzSjZxxt
Introducing ChatQA-1.5, a family of models (Llama3-ChatQA-1.5-8B and Llama3-ChatQA-1.5-70B) that excel at conversational QA and RAG. We also open source our instruction tuning data, ChatRAG Bench for evaluation, and a multi-turn QA retriever. Link: https://t.co/2uxHQnfzZB https://t.co/gR9AOiHZLJ
Introducing ChatQA-1.5, a family of models that surpasses GPT-4-0613 and Command-R-Plus on RAG and conversational QA. ChatQA-1.5 has two variants: Llama3-ChatQA-1.5-8B (https://t.co/H7JvIFCD48) and Llama3-ChatQA-1.5-70B (https://t.co/Ao3Yw8ECxA). We also open source our instruction…
Don't worry, we didn't forget about 70b🥳 Take a look at the first @AIatMeta Llama-3 70b model with a 262K context length - scoring perfect retrieval on NIAH! We included an extensive proprietary chat dataset to give the model chat ability over long sequences as well. A… https://t.co/toYrUUTRJH
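Several of the tweets above cite perfect NIAH (needle-in-a-haystack) retrieval. A minimal sketch of how such a test case is built: a "needle" fact is buried at a chosen depth inside filler text and the model is scored on recovering it. Real harnesses differ in needle wording, depth/length sweeps, and grading (some use an LLM judge); this only illustrates the construction:

```python
# Minimal sketch of a needle-in-a-haystack (NIAH) test case.
# Real evaluation harnesses vary; this shows the basic idea only.

def build_niah_prompt(filler: str, needle: str, depth: float, length: int) -> str:
    """Embed `needle` at relative `depth` (0.0 = start, 1.0 = end)
    inside `length` repetitions of `filler`."""
    haystack = [filler] * length
    haystack.insert(int(depth * length), needle)
    return " ".join(haystack) + "\nWhat is the magic number mentioned above?"

def score(model_answer: str, needle_value: str) -> bool:
    # Simplest grading: the needle's value appears in the answer.
    return needle_value in model_answer

prompt = build_niah_prompt(
    filler="The grass is green and the sky is blue.",
    needle="The magic number is 42.",
    depth=0.5,
    length=1000,
)
print(score("I believe the magic number is 42.", "42"))  # True
```

A "perfect score" means the model retrieves the needle at every tested (depth, context-length) combination up to the advertised window.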
We're excited to see @ArtificialAnlys's newly launched leaderboard on @huggingface, with @GroqInc continuing to set the bar for throughput (tokens/s). Groq's throughput for Llama 3 70B exceeds what the vast majority of providers can deliver for Llama 3 8B. https://t.co/wa5gqnEBfe
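The throughput metric behind that comparison is simply generated tokens divided by wall-clock generation time. A minimal sketch of the measurement, with `generate()` as a stand-in for a real model call (the actual leaderboard methodology also accounts for time-to-first-token and request batching):

```python
import time

# Minimal sketch of the tokens/s throughput metric:
# generated tokens divided by wall-clock generation time.

def generate(prompt: str) -> list[str]:
    # Placeholder for a real model call: pretend 256 tokens come back.
    return ["tok"] * 256

start = time.perf_counter()
tokens = generate("Explain grouped-query attention.")
elapsed = time.perf_counter() - start

throughput = len(tokens) / elapsed
print(f"{len(tokens)} tokens in {elapsed:.4f}s -> {throughput:.0f} tokens/s")
```

With a real endpoint, `elapsed` would be measured around the streaming response, and 70B throughput exceeding most providers' 8B numbers is what makes the Groq result notable.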
🚀 The first fine-tuned models to score higher than Llama-3-70B & achieve the best MMLU/GSM8K at the same time! - 3 of the top 10 spots on the Open LLM Leaderboard are now held by these fine-tuned models - Achieved the highest MMLU / GSM8K on the @huggingface Leaderboard https://t.co/UeVaHPOe6t
Some awesome new models in the 🤗 MLX community: - Hermes 2 Pro Llama 3 by @NousResearch - Llama 3 OpenBio LLM for medical domain by @aadityaura - Llama3-ChatQA-1.5-8B by @nvidia All here: https://t.co/dUgErUXnM3 h/t @vkash16, @lucataco93, @Prince_Canuma for conversions!
Meta’s Llama3 AI: ChatGPT Intelligence… For Free! https://t.co/P4BTfuNjxP