Progress in Developing Mixtral 8x7B Model by @akashnet_ Comparable to GPT 3.5, Available on Google Colab with HQQ and MoE Optimization

Finally, you can run Mixtral-8x7B models on free Colab or consumer desktops. A team optimized inference using mixed quantization with HQQ and an MoE offloading strategy, so the model now fits within combined GPU and CPU memory. Demo: https://t.co/jTl8u4569r https://t.co/Ow5oi93LDR
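
For readers wondering what "MoE offloading" means in practice: Mixtral routes each token to 2 of its 8 experts per layer, so only a few experts need to be in GPU memory at any moment. A minimal sketch of that idea follows, assuming a simple least-recently-used (LRU) cache policy. The class `ExpertCache`, the function `run_demo`, and the string stand-ins for weights are hypothetical names made up for illustration; this is not the team's code, and real systems move quantized tensors rather than strings.

```python
from collections import OrderedDict


class ExpertCache:
    """Toy LRU cache: at most `capacity` experts live on the "GPU".

    All experts stay in "CPU" storage; a miss simulates a CPU-to-GPU copy.
    (Hypothetical illustration, not code from any real library.)
    """

    def __init__(self, cpu_experts, capacity):
        self.cpu_experts = cpu_experts  # expert_id -> weights (stand-in objects)
        self.capacity = capacity        # how many experts fit on the GPU
        self.gpu = OrderedDict()        # expert_id -> weights, least recent first
        self.loads = 0                  # number of simulated CPU-to-GPU transfers

    def get(self, expert_id):
        if expert_id in self.gpu:
            # Cache hit: mark this expert as most recently used.
            self.gpu.move_to_end(expert_id)
            return self.gpu[expert_id]
        if len(self.gpu) >= self.capacity:
            # Cache full: evict the least recently used expert.
            self.gpu.popitem(last=False)
        # Cache miss: "copy" the expert from CPU storage to the GPU.
        self.gpu[expert_id] = self.cpu_experts[expert_id]
        self.loads += 1
        return self.gpu[expert_id]


def run_demo():
    # Eight experts, as in Mixtral 8x7B; strings stand in for weight tensors.
    experts = {i: f"weights-{i}" for i in range(8)}
    cache = ExpertCache(experts, capacity=2)
    # Made-up routing decisions: each token uses two experts.
    routing = [(0, 3), (0, 3), (3, 5), (0, 5)]
    for pair in routing:
        for expert_id in pair:
            cache.get(expert_id)
    return cache.loads, list(cache.gpu)


if __name__ == "__main__":
    loads, resident = run_demo()
    print(f"transfers: {loads}, experts on GPU: {resident}")
```

In this toy run, eight expert lookups need only four transfers because consecutive tokens often reuse the same experts. Quantization helps on the other side of the trade-off by shrinking each expert, so more of them fit in the cache and each transfer is cheaper.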

Soumith Chintala@soumithchintala

6 mo

gpt-fast now supports mixtral-8x7B, in addition to gpt/llama. 1000 lines of simple pytorch code blazing it out! https://t.co/crXdcNy0uv https://t.co/W1HHn0DeWM

Towards AI@towards_AI

6 mo

Run Mixtral 8x7b on Google Colab Free via #TowardsAI → https://t.co/5DCQMHrbjp

Rohan Paul@rohanpaul_ai

6 mo

Under-the-hood technique in this paper that made possible running the huge Mixtral-8x7B models in Free colab or smallish GPUs like a 3060. 🔥 Paper - "Fast Inference of Mixture-of-Experts Language Models with Offloading" 🚀 Quite a big achievement for low-resource Inferencing… https://t.co/hoGz1rqabq https://t.co/Y7tq4zeLKx

Jesse Eckel@Jesseeckel

6 mo

Pretty insane to see the progress @akashnet_ has been making. Mixtral 8x7B is supposed to be on par with GPT 3.5. Would be interesting to see how crypto could bootstrap and incentivize some of this development. Also open source AI + DePIN is something to keep an eye on. https://t.co/LGWY7R1Zga
