Mistral AI has released the Mixtral-8x7B model, a sparse mixture-of-experts (MoE) architecture that integrates layers of expert sub-models within the Transformer block. The model, named for its eight 7B-parameter experts, outperforms existing open language models and has been made available for testing. The release has caused a stir in the AI community, with comparisons to GPT-3.5 and discussions about the potential impact of open-source GPT-4-class models. The model's performance metrics and demos have been highlighted, and it has been lauded for its efficiency in splitting work among specialized sub-models.
We just got more details on Mixtral 8x7B from @MistralAI 🧠 Mixtral is a sparse mixture-of-experts (SMoE) model with open weights, outperforming existing open LLMs like Meta's Llama 2 70B. 🤯 💪🏻 TL;DR: ⬇️ https://t.co/uMJeebqL2G
interesting: mistral has their largest model, "mistral-medium", on their cloud API. no details on what it is, or if it'll ever be open-sourced, other than that it outperforms mixtral-8x7B by a long shot. https://t.co/VaWhLOBPtt
.@MistralAI just released their blog post on Mixtral MoE, read about it here: https://t.co/5xGmu7l8w7
Transformer MoE Architectures - Why They Are More Efficient: The Mistral 8x7B MoE model is a solid GPT-3.5-class model, comparable to 70B dense models. Instead of having every part of the model work on every task, an MoE model splits the work among many specialized sub-models, or "experts." Each expert is good… https://t.co/aoDfAcfXWO https://t.co/wYVYuTE4C7
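Editor's note: a back-of-the-envelope sketch of why sparse MoE is cheaper per token. The layer split below (expert FFN size vs. shared attention/embedding weights) is an illustrative assumption, not Mistral's published configuration; the point is only that with top-2 routing, 2 of 8 expert FFNs run per token.

```python
# Rough parameter accounting for a sparse MoE transformer.
# All sizes are illustrative assumptions, not Mistral's published config.

def moe_total_params(n_experts: int, expert_params: float, shared_params: float) -> float:
    """Parameters stored: every expert is kept in memory, shared layers counted once."""
    return n_experts * expert_params + shared_params

def moe_active_params(top_k: int, expert_params: float, shared_params: float) -> float:
    """Parameters actually used per token: only the top_k routed experts run."""
    return top_k * expert_params + shared_params

# Hypothetical split: ~5.6B of each "7B" expert block is the expert FFN,
# with ~1.4B of attention/embedding weights shared across all experts.
EXPERT, SHARED, N, K = 5.6e9, 1.4e9, 8, 2

print(f"stored: {moe_total_params(N, EXPERT, SHARED) / 1e9:.1f}B")    # ~46B total
print(f"active: {moe_active_params(K, EXPERT, SHARED) / 1e9:.1f}B")   # ~13B per token
```

This is why the "8x7B" name overstates the per-token cost: the model stores roughly 70B-class capacity but computes with only a fraction of it on each token.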
"Mistral AI bucks release trend by dropping torrent link to new open source LLM" — VentureBeat See the highlights of the story below! 1/10 🧵 https://t.co/ztfs1PUsV7
#AIRevolution: Mistral AI's Mixtral-8x7B Model Takes #SEO To New Levels - Explore Performance Metrics & Demos! #SearchEngineJournal https://t.co/V4RWQRjZ6I
Learn more about Mixtral-8x7B, the new model from @MistralAI, including performance metrics, four demos to try, and what #AI says about #SEO. https://t.co/HFrQZ9QaAc
Launching some unusual & experimental models today: 1/ 👬 Mistral: Mixtral 8x7B Chat Eight 7B models combined into a mixture of experts, for roughly 47B parameters in total (less than a naive 8 × 7B = 56B, since the experts share the attention layers), by @MistralAI. Launching a chat version by Fireworks, 100% discounted: https://t.co/hQFbAj3paC https://t.co/fvOfqXf9n6
Mixtral-8x7B outperforms Llama-2-70B as per OpenCompass model evaluations https://t.co/nfoPBo55hU
Mistral `mistral-8x7b` on the @vercel AI SDK playground is now chat-tuned and it’s. so. delightful 😍 https://t.co/NvcQHQPs9l
Announcing Mistral 8x7B-*Chat*! A very capable chat model built on top of the new Mistral MoE model, trained on the SlimOrca dataset. Download here: https://t.co/Qg1vuGm7mD
🚀 Skip the wait for Google's Gemini and jump straight into action with @Mistral's Mixtral-8x7B. Released via torrent, no frills attached, it's ready for you to test drive in a vllm-powered interface. Experience the power of MoE AI without the wait: https://t.co/4JO9eM83lb
Another win for open source! 🔥 @MistralAI Mixtral-8x7b is now on #vLLM 🚀 Check out our mixtral8x7b branch on https://t.co/kvUbQmk370 #mixtral #mistral #discolm https://t.co/tZw5G6U2oi
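Editor's note: a minimal sketch of running the model through vLLM's offline Python API, as the two posts above suggest. It assumes the Hugging Face repo id mistralai/Mixtral-8x7B-v0.1 and a vLLM build with Mixtral support (at the time, the mixtral8x7b branch mentioned above); not an official example.

```python
# Minimal vLLM sketch; assumes a vLLM build with Mixtral support and
# enough GPU memory (the fp16 weights alone are on the order of 90 GB).
from vllm import LLM, SamplingParams

llm = LLM(
    model="mistralai/Mixtral-8x7B-v0.1",  # assumed HF repo id
    tensor_parallel_size=2,               # shard across GPUs; adjust to your hardware
)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Explain mixture-of-experts in one paragraph."], params)
print(outputs[0].outputs[0].text)
```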
Mistral AI bucks release trend by dropping torrent link to new open source LLM https://t.co/6It0FsiVpu
We released our tuned Mixtral chat a few hours ago. Play with it through our app or API: https://t.co/qkiR9W526V. Big thanks to @MistralAI for the new addition of this MoE model. We are very excited about it.
Mistral 8x7B is now available in LangSmith Playground. It uses the implementation by the Fireworks AI team, which reverse-engineered the architecture from the parameter names. This isn't an official implementation, as the model code hasn't been released. https://t.co/XrExDMSzYq https://t.co/A4Y2q0Nwbr
The company that releases the first GPT-4 class open-source model will make history! They will be hailed as the biggest heroes of our time, and humanity will forever remember them for having liberated AI. Yes, it's possible - Instead of an 8x7B MoE, we need an 8x70B MoE! https://t.co/w7tzbC5kEr
You can try the new Mistral 8x7B model here https://t.co/pBctoKTSuV
You can now try @MistralAI mixtral-8x7b on the @Vercel AI Playground and use it with the AI SDK. (h/t @thefireworksai for the experimental implementation) Here's a video comparing it side-by-side to GPT-3.5-Turbo and Llama 2 70b Chat https://t.co/rEWptmjmQY https://t.co/QdC6Jj0oJU
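Editor's note: a sketch of running the same side-by-side comparison from a script rather than the Playground, using Fireworks' OpenAI-compatible endpoint mentioned above. The base URL and model id here are assumptions, so check the provider's docs before relying on them.

```python
# Sketch: query Mixtral through an OpenAI-compatible endpoint.
# Base URL and model id are assumptions; confirm against provider docs.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",  # assumed endpoint
    api_key="YOUR_API_KEY",
)

resp = client.chat.completions.create(
    model="accounts/fireworks/models/mixtral-8x7b",  # assumed model id
    messages=[{"role": "user", "content": "Summarize mixture-of-experts in two sentences."}],
    max_tokens=128,
)
print(resp.choices[0].message.content)
```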
Initial evals for Mistral MoE are out, and it is a solid 70B-class model that is very similar to GPT 3.5, Gemini Pro, and DeepSeek, and slightly better than Llama2-70B. MMLU on the base models is at 0.717, compared to Gemini Pro's 0.718, DeepSeek's ~0.717, and GPT 3.5's 0.70. On other… https://t.co/iCqGUVUTg9 https://t.co/7OrJEig9OL
If you want to try the new Mistral 8x7B model, you can do so here: https://t.co/hxqqzUzjef
What is Mixture-of-Experts (MoE)? MoE is a neural network architecture design that integrates layers of experts/models within the Transformer block. As data flows through the MoE layers, each input token is dynamically routed to a subset of the experts for computation. This… https://t.co/56mKkrHL34 https://t.co/AnYeITgHVi
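Editor's note: a minimal PyTorch sketch of the per-token routing described above. Dimensions and the top-2 choice follow the generic SMoE recipe, not Mistral's reference implementation (which, per the note above, wasn't public at the time).

```python
# Minimal sparse MoE layer: each token is routed to its top-2 experts.
# Toy dimensions; generic SMoE recipe, not Mistral's (unreleased) code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                       # x: (tokens, d_model)
        logits = self.router(x)                 # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)    # renormalize over the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e        # tokens whose slot-th pick is expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

moe = SparseMoE()
tokens = torch.randn(10, 64)
print(moe(tokens).shape)  # torch.Size([10, 64])
```

The key property is visible in the forward pass: the router scores all experts, but only the top-2 expert FFNs ever execute for a given token.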
Mistral AI's Unconventional Torrent-Based Release of MoE 8x7B LLM Shakes Up the AI Community #AI #AIcommunity #AItechnology #AndreessenHorowitz #artificialintelligence #EricJang #EUAIAct #Fundinground #Gemini #Google #GPT4 #JayScambler #languagemodel https://t.co/HInDfKmedc https://t.co/OHObHHZmTW