A new language model called Eagle-7B, based on the RWKV-v5 architecture, has been introduced. It outperforms Mistral-7B on multilingual benchmarks and handles 100+ languages. The model, an attention-free transformer, is trained on 1.1 trillion tokens and reaches English performance comparable to the best 7B models trained on roughly 1T tokens. Eagle-7B is open source under the Apache 2.0 license. Because it replaces attention with an RNN-style recurrence, it offers 10-100x lower inference cost, faster generation, and longer usable context. The RWKV architecture aims to balance computational efficiency and model performance in sequence-processing tasks by combining aspects of both Transformers and RNNs.
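To make the inference-cost claim concrete, here is a toy sketch (not RWKV's actual code) of why recurrent decoding stays cheap: the model carries a fixed-size state, so every generated token costs the same regardless of context length, whereas a transformer's KV cache and per-token attention cost grow with the sequence.

```python
import numpy as np

# Illustrative only: an RNN-style decoder carries a fixed-size state,
# so step t costs O(1) in time and memory; a transformer attending over
# a growing KV cache pays O(t) per step instead.

d = 8  # toy hidden size (assumption; real models use thousands)

def rnn_step(state, x, W_h, W_x):
    """One decode step: fixed-size state in, fixed-size state out."""
    return np.tanh(state @ W_h + x @ W_x)

rng = np.random.default_rng(0)
W_h, W_x = rng.normal(size=(d, d)), rng.normal(size=(d, d))
state = np.zeros(d)
for _ in range(1000):          # 1,000 tokens of context...
    x = rng.normal(size=d)     # ...but the state (and memory) never grows
    state = rnn_step(state, x, W_h, W_x)
```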
RWKV-5 "Eagle" 7B is Mistral-7B level for language modeling of unseen arxiv CS & Physics papers, and significantly better than Llama2🐦We are testing more new data. https://t.co/Pm6i6vowUH https://t.co/DEHGgjwKbp
🔥 “Small” LLMs are the ones that have 1-2B parameters (instead of 7-200B). They are still trained with trillions of words. The idea is to push the envelope on “information compression” to develop models that can be much faster and much smaller for specialized use cases, such as… https://t.co/v1b4UFZTeJ
📌 The Receptance Weighted Key Value (RWKV, the architecture behind Eagle-7B) introduced by Peng et al. aims to reconcile the trade-off between computational efficiency and model performance in sequence processing tasks. 📌 RWKV combines aspects of both Transformers and RNNs… https://t.co/XOYvi3wDhK https://t.co/KncPEkfLmO
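For a feel of what "combining Transformers and RNNs" means, below is a minimal per-channel sketch of the WKV recurrence from the RWKV paper (v4-style; the per-token u bonus and the log-space numerical stabilization of real implementations are omitted, and Eagle's v5 generalizes this to multi-headed, matrix-valued states). It computes an exponentially decayed weighted average of past values, which can be trained in parallel like attention but run as an O(1)-per-token recurrence at inference.

```python
import numpy as np

def wkv_recurrence(k, v, w):
    """Simplified RWKV (v4-style) WKV recurrence for one channel.

    k, v : float arrays of shape (T,) -- key and value streams
    w    : positive decay rate for this channel
    Returns the WKV outputs of shape (T,). Illustrative only:
    omits the 'u' bonus term and numerical stabilization.
    """
    num, den = 0.0, 0.0            # fixed-size recurrent state
    decay = np.exp(-w)             # in (0, 1): older tokens fade out
    out = np.zeros(len(k))
    for t in range(len(k)):
        num = decay * num + np.exp(k[t]) * v[t]
        den = decay * den + np.exp(k[t])
        out[t] = num / den         # decayed softmax-like average of past v
    return out

# Toy usage with random keys/values for a 16-token sequence.
rng = np.random.default_rng(0)
print(wkv_recurrence(rng.normal(size=16), rng.normal(size=16), w=0.5))
```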
Big. An RNN-based LLM just outperformed transformers. Eagle-7B is a new attention-free LLM trained on 1 trillion tokens across 100+ languages. Its RWKV-v5 architecture uses RNNs instead of the transformer architecture, allowing 10-100x lower inference cost, higher speed, and longer context… https://t.co/BL1UPsH4k3
Eagle 7B: Soaring past Transformers with 1 Trillion Tokens Across 100+ Languages. A brand new era for the RWKV-v5 architecture and linear transformers has arrived - with the strongest multi-lingual model in open source today https://t.co/ByYjec1VhM
RWKV-v5 Eagle 7B is out 🔥 ✨ Trained on 1.1 Trillion Tokens across 100+ languages 📄 Apache 2.0 🚀 Outperforms all 7B class models Model: https://t.co/3MREbmukgj Demo: https://t.co/u0WJhAprwv 💡 Check the blog; their response to the question about multi-lingual performance is really cool… https://t.co/zwrWHW3cIn
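For anyone who wants to try the weights rather than the hosted demo, a hedged usage sketch follows. The model id `RWKV/v5-Eagle-7B-HF` is an assumption (check the model link above), and loading uses `trust_remote_code=True` since RWKV-v5 ships custom modeling code rather than a built-in transformers class.

```python
# Sketch: loading Eagle 7B through Hugging Face transformers.
# "RWKV/v5-Eagle-7B-HF" is an assumed repo id -- verify against the
# official model link before running.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RWKV/v5-Eagle-7B-HF"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

prompt = "The eagle soared over"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```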
🦅 Eagle 7B: RWKV (RNNs) Soaring past Transformers with 1 Trillion Tokens Across 100+ Languages 🚀 Outperforms all 7B class models in multi-lingual benchmarks. Perhaps a good dataset + scalable architecture is all you need? 🤓 👏 Licensed under Apache 2.0. Demo on Spaces! https://t.co/AcTfRDcsUk
A new RWKV (non-transformer-architecture) LLM called Eagle 7B has just been released. This model stands out by being competitive with Mistral 7B and excels, in particular, at handling multilingual tasks. Model and demo links 👇 https://t.co/MnlcEmgmYs
Introducing Eagle-7B. Based on the RWKV-v5 architecture, bringing into the open-source space the strongest multi-lingual model (beating even Mistral) and the strongest attention-free transformer today (10-100x+ lower inference cost), with English performance comparable to the best 1T-token 7B models https://t.co/hWtEMC1264
RWKV-5 "Eagle" 7B: beats Mistral-7B at multilingual, reaches Llama2-7B level at English, while being 100% attention-free RNN and only trained 1.1T tokens. Gradio Demo: https://t.co/k0AivnxCwP RWKV-6 "Finch" 1B5 in ~10days, 3B in ~30days. https://t.co/c6dByjF976