The AI community is actively engaging with Mamba, a new language model architecture based on state-space models (SSMs) that is seen as a competitor to the Transformer. Mamba was created by Albert Gu and Tri Dao, and a minimal implementation of the architecture has since appeared in a single file of PyTorch, roughly 300 lines of code. Support for Mamba has also landed in lm-evaluation-harness, enabling reproducible benchmarking against the Pythia models. A Practical ML Dive session invites users to learn how to train Mamba on their own data, and five resources are shared to help readers understand SSMs and Mamba.
Are you wondering how the new Mamba language model works? Mamba is based on state-space models (SSMs), a new competitor to the Transformer architecture. Here are 5 resources to help you learn about SSMs & Mamba! ↓↓↓
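For orientation before diving into those resources: at their core, SSM layers run a discretized linear state-space recurrence over the sequence. Below is a toy sketch of that recurrence in PyTorch; the names and shapes are illustrative assumptions, not taken from any particular resource or implementation.

```python
# Toy sketch of the discrete linear SSM recurrence underlying SSM layers:
#   h_t = A_bar @ h_{t-1} + B_bar * x_t,   y_t = C @ h_t
# Names and shapes are illustrative, not from any specific codebase.
import torch

def ssm_recurrence(x, A_bar, B_bar, C):
    """x: (length,) scalar input sequence; A_bar: (n, n); B_bar, C: (n,)."""
    n = A_bar.shape[0]
    h = torch.zeros(n)
    ys = []
    for x_t in x:
        h = A_bar @ h + B_bar * x_t  # state update
        ys.append(C @ h)             # readout
    return torch.stack(ys)
```

Mamba's contribution on top of this classic recurrence is making the state-space parameters input-dependent ("selective"); the resources above walk through how and why.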
Support for Mamba has landed in lm-evaluation-harness! Use it with `--model mamba_ssm`: https://t.co/4dTLhmWbtP Was really happy to see @_albertgu @tri_dao provide support for our new release natively alongside their architecture code, to benchmark against Pythia reproducibly! https://t.co/Or7NurSD5n
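For those calling the harness from Python rather than the CLI, a hedged sketch using its `simple_evaluate` entry point might look like the following. The `mamba_ssm` model type comes from the tweet above; the checkpoint name and task choice are illustrative assumptions.

```python
# Hedged sketch: benchmarking a Mamba checkpoint via lm-evaluation-harness's
# Python entry point. "mamba_ssm" is the model type from the release above;
# the pretrained checkpoint and task are assumptions for illustration.
from lm_eval import simple_evaluate

results = simple_evaluate(
    model="mamba_ssm",
    model_args="pretrained=state-spaces/mamba-130m",  # assumed checkpoint name
    tasks=["lambada_openai"],                         # assumed task
)
print(results["results"])
```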
Nice job implementing Mamba in 300 lines of code!! https://t.co/IiNYibEHLQ
Today we're continuing with Mamba: how to train Mamba on your own data! See you in the Practical ML Dive soon https://t.co/3fbdg1DLAL https://t.co/nBDMjWpOMC
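Ahead of that session, here is one hedged sketch of what fine-tuning on your own data can look like. It assumes the `mamba_ssm` package exposes `MambaLMHeadModel` with a `from_pretrained` loader and a forward pass returning `.logits` (assumptions about the API); the checkpoint name is illustrative.

```python
# Hedged sketch of a minimal fine-tuning loop on your own tokenized text.
# Assumes mamba_ssm exposes MambaLMHeadModel with from_pretrained and a
# forward returning .logits (an assumption); checkpoint name is illustrative.
import torch
import torch.nn.functional as F
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

model = MambaLMHeadModel.from_pretrained("state-spaces/mamba-130m").cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

def train_step(input_ids):
    """input_ids: (batch, seq_len) token ids drawn from your own corpus."""
    logits = model(input_ids).logits
    # Next-token prediction: shift logits and targets by one position.
    loss = F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),
        input_ids[:, 1:].reshape(-1),
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```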
Minimal implementation of Mamba, the new LLM architecture from @_albertgu and @tri_dao, in one file of PyTorch https://t.co/SeoDcakm6V
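The heart of any such single-file implementation is the selective scan. A slow but readable reference version, written as a plain PyTorch loop, might look like the sketch below; the shapes and names are our own, not the linked repo's, and a real implementation would replace the Python loop with a fused parallel scan.

```python
# Hedged sketch of Mamba's sequential selective scan (reference version):
#   h_t = exp(delta_t * A) * h_{t-1} + delta_t * B_t * x_t,  y_t = C_t . h_t + D * x_t
# Shapes and names are ours; real code fuses this loop for speed.
import torch

def selective_scan(x, delta, A, B, C, D):
    """x, delta: (b, l, d); A: (d, n); B, C: (b, l, n); D: (d,)."""
    b, l, d = x.shape
    n = A.shape[1]
    # Discretize: deltaA and deltaB_x both have shape (b, l, d, n).
    deltaA = torch.exp(delta.unsqueeze(-1) * A)
    deltaB_x = delta.unsqueeze(-1) * B.unsqueeze(2) * x.unsqueeze(-1)
    h = torch.zeros(b, d, n, device=x.device)
    ys = []
    for t in range(l):
        h = deltaA[:, t] * h + deltaB_x[:, t]          # recurrent state update
        ys.append((h * C[:, t].unsqueeze(1)).sum(-1))  # readout: (b, d)
    y = torch.stack(ys, dim=1)                         # (b, l, d)
    return y + x * D                                   # skip connection
```

Because B, C, and delta vary with the input at each timestep, this recurrence is "selective": the layer can decide per token what to write into and read out of its state.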