
MSFT dropped bitnet 2B - trained from scratch on 4T tokens, with W1.58A8 quant 🔥 Why isn’t everyone going crazy about it?? https://t.co/BhdXxNpdiq
6 posts • ChatGPT (GPT-4o mini)
Super cool. @MSFTResearch just released world's first natively trained 1-bit model: BitNet b1.58 2B4T. Trained on 4T tokens. It’s claimed to nearly match Qwen 2.5 1.5B in performance while being 1/6 of its size and 2x faster. → This architecture uses W1.58A8 quantization and https://t.co/jn09IDjs6W
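For context, "W1.58A8" means ternary weights (three values, about log2(3) ≈ 1.58 bits each) paired with 8-bit activations. Below is a minimal, illustrative sketch of that scheme following the absmean-weight / absmax-activation recipe described in the BitNet b1.58 papers; it is NumPy toy code, not Microsoft's implementation.

```python
# Illustrative sketch of W1.58A8-style quantization (not Microsoft's code).
# Weights: ternary {-1, 0, +1} via absmean scaling; activations: per-token
# 8-bit absmax quantization, as described in the BitNet b1.58 papers.
import numpy as np

def quantize_weights_ternary(W, eps=1e-5):
    """Absmean weight quantization to {-1, 0, +1} with a per-tensor scale."""
    scale = np.abs(W).mean() + eps            # gamma = mean(|W|)
    W_q = np.clip(np.round(W / scale), -1, 1)
    return W_q.astype(np.int8), scale         # dequant: W ~= W_q * scale

def quantize_activations_int8(x, eps=1e-5):
    """Per-token absmax quantization of activations to 8 bits."""
    scale = np.abs(x).max(axis=-1, keepdims=True) / 127.0 + eps
    x_q = np.clip(np.round(x / scale), -128, 127)
    return x_q.astype(np.int8), scale         # dequant: x ~= x_q * scale

# Toy usage: a "linear layer" with ternary weights and int8 activations.
rng = np.random.default_rng(0)
W = rng.normal(size=(256, 256)).astype(np.float32)
x = rng.normal(size=(4, 256)).astype(np.float32)

W_q, w_scale = quantize_weights_ternary(W)
x_q, x_scale = quantize_activations_int8(x)

# Integer matmul (accumulate in int32), then rescale back to float.
y = (x_q.astype(np.int32) @ W_q.T.astype(np.int32)).astype(np.float32)
y = y * x_scale * w_scale
```

The point of the release is that the weights are trained natively in this low-bit form rather than quantized after the fact, which is why it is described as "natively trained" rather than post-training quantized.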
BitNet b1.58 2B4T: First open native 1-bit LLM at 2B scale, by Microsoft. Matches full-precision peers on accuracy with 10x lower energy and 8x less memory. Trained on 4T tokens. https://t.co/lIlNbcKS5x
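The memory claim is easy to sanity-check with back-of-envelope arithmetic: storing roughly 2B weights at about 1.58 bits each is around an order of magnitude smaller than FP16. The real footprint depends on bit packing, embeddings and activations kept in higher precision, and runtime overhead, so treat this as illustrative only.

```python
# Back-of-envelope only: weight storage for ~2B parameters at different widths.
params = 2e9
fp16_gb    = params * 16   / 8 / 1e9   # ~4.0 GB
int8_gb    = params * 8    / 8 / 1e9   # ~2.0 GB
ternary_gb = params * 1.58 / 8 / 1e9   # ~0.4 GB (ternary weights, ~log2(3) bits each)
print(f"fp16 ~{fp16_gb:.1f} GB, int8 ~{int8_gb:.1f} GB, 1.58-bit ~{ternary_gb:.2f} GB")
```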
Microsoft released a 2B BitNet model trained on 4T tokens. Performance is very similar to Qwen-2.5-1.5B. BitNet is cool because it is faster and cheaper to run: https://t.co/JGAGbzdKTs https://t.co/1cofPEmqL9 https://t.co/W1QpPYeGMd
🚀 You can now use B200s on Baseten and get higher model throughput, lower latency, and better cost per token! 🚀 From benchmarks on models like DeepSeek R1, Llama 4, and Qwen, we’re already seeing: • 5x higher throughput • Over 2x better cost per token • 38% lower latency https://t.co/wBFJbDwYB0