Several developers have quantized the Code Llama language models to 4-bit using MLX, allowing them to run very fast on Apple Silicon. Quantized versions of the 70B, 13B, and 7B models have been uploaded to the 🤗 MLX Community, including Python and Instruct variants of the 7B and 13B models, with the remaining 13B upload nearly complete. Because 4-bit quantization sharply reduces memory use, commenters expect the models to run well even on phones and other small devices, joking that the different sizes could power a new Siri: the big one for college essays, the small one for fast answers, the medium one for restaurant recommendations.
Just finished uploading 4 MLX models. Quantized Code Llama 7B and 13B, Python and Instruct! Link here https://t.co/T8yag84Qgs https://t.co/dFn25QGs5H
Apple AI: 4-bit quantization means they run fast on phones and other small computers. And three different models: big to do college essays, small to answer you fast, medium to find you a restaurant for dinner. New Siri! https://t.co/rxlhNrHj7k
4-bit quantized Code Llama models already in the 🤗 MLX Community! {70, 13, 7}B models here: https://t.co/dUgErUXnM3 1. pip install mlx-lm 2. python -m mlx_lm.generate --model mlx-community/CodeLlama-13b-Python-4bit --prompt "write a quick sort in C++" Thanks to…
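The two-step recipe in the post above can be written out as shell commands (a sketch, assuming an Apple Silicon Mac with Python installed; the model weights are fetched from the Hugging Face Hub on first run):

```shell
# Step 1: install the mlx-lm package.
pip install mlx-lm

# Step 2: generate code with a 4-bit quantized Code Llama model.
# The weights (several GB) are downloaded automatically on first use.
python -m mlx_lm.generate \
  --model mlx-community/CodeLlama-13b-Python-4bit \
  --prompt "write a quick sort in C++"
```

Swapping the `--model` argument for another repo in the mlx-community collection (e.g. the 7B or 70B variants) should work the same way.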
I've just quantized CodeLlama 70b Instruct to 4-bit with MLX, you can now run this model super fast on Apple Silicon. Here's the link to the model! https://t.co/rYhattJNvn
"Run this model super fast on Apple Silicon." https://t.co/lmP1tvA2RS
I've just quantized CodeLlama 7b Python to 4-bit with MLX, meaning you can now run this model super fast on Apple Silicon. Here's the link to the model! https://t.co/Uenhv6rNpD By the end of the day, my goal is to add all the new models. The 13B one is almost done!
This is how fast @ollama-hosted Code Llama 70B writes the game Snake in Python. Probably the biggest language model I've ever run on my MacBook Pro! What a beast! https://t.co/m21nLnLKyW