Apple and its collaborators are making significant strides in machine learning with the release of MLXServer and updates to MLX Swift. MLXServer, a new project announced by Mustafa (@maxaljadery) and Siddharth, gives developers an easy way to work with large language models (LLMs) locally, exposing HTTP endpoints for text generation, chat, model conversion, and more. It installs via 'pip install mlxserver' and is optimized for Apple's Metal, signaling a focus on performance. Concurrently, MLX Swift has been updated with fused attention (contributed by @argmaxinc) and fast quantized kernels, while (Q)LoRA support in MLX LM has gained flexibility (tunable layers, rank, scale) and efficiency for fine-tuning. These updates suggest Apple's ambition to position MLX as a serious competitor to TensorFlow and PyTorch, especially given its unified memory model, which runs operations in parallel with automatic dependency insertion. The MLX Swift update also ships an LLM example in which a 4-bit Mistral 7B model runs quickly on an M1 chip, highlighting Apple's commitment to optimizing machine learning on its own hardware. Separately, the Levanter project now supports LoRA for fully reproducible lightweight fine-tuning on GPU or TPU, and in MLX LM, compilation, better data packing, and gradient checkpointing make fine-tuning a 4-bit Mistral 7B on an 8GB M1 quite feasible.
(Q)LoRA in MLX LM is also faster and more memory efficient thanks to:
- compilation
- better data packing
- gradient checkpointing
pip install -U mlx-lm
Fine-tuning 4-bit Mistral 7B on an 8GB (!) M1 is actually quite doable: https://t.co/WhFBDQLDHi
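A rough back-of-envelope check (my own arithmetic, not from the thread) shows why 4-bit quantization is what makes an 8GB M1 workable here:

```python
# Back-of-envelope memory estimate for a 4-bit quantized Mistral 7B.
# The parameter count is approximate; all figures are illustrative,
# not measurements from the thread.

PARAMS = 7.25e9          # Mistral 7B has roughly 7.25 billion parameters
BITS_PER_WEIGHT = 4      # 4-bit quantization
BYTES_PER_GB = 1024**3

weights_gb = PARAMS * BITS_PER_WEIGHT / 8 / BYTES_PER_GB
print(f"4-bit weights: ~{weights_gb:.1f} GB")   # roughly 3.4 GB

# Compare with float16 weights, which alone would overflow an 8GB machine:
fp16_gb = PARAMS * 2 / BYTES_PER_GB
print(f"fp16 weights:  ~{fp16_gb:.1f} GB")      # roughly 13.5 GB
```

With only ~3.4 GB spent on weights, the remaining headroom on an 8GB machine goes to LoRA adapters, activations, and the OS — which is exactly where gradient checkpointing and better data packing pay off.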
(Q)LoRA in MLX LM is a lot more flexible now: tune layers, rank, scale, and more. pip install -U mlx-lm Example config: https://t.co/0SzXyddDdb Thanks to Chimezie https://t.co/3EPLGxAys9 for the addition! https://t.co/XH0wVQgmiN
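The example config is behind a shortened link, so its contents aren't visible in the thread; as a sketch of the kind of knobs the tweet mentions (layers, rank, scale), a fine-tuning config might look like the following. The key names and values are illustrative assumptions, not the actual mlx-lm schema:

```python
# Illustrative (Q)LoRA fine-tuning config as a plain dict.
# Key names and values are assumptions for illustration only;
# consult the mlx-lm docs for the real schema.
lora_config = {
    "model": "mlx-community/Mistral-7B-4bit",  # hypothetical model id
    "train": True,
    "lora_layers": 16,        # how many transformer layers get adapters
    "lora_parameters": {
        "rank": 8,            # adapter rank: capacity vs. memory trade-off
        "scale": 20.0,        # scaling applied to the adapter output
        "dropout": 0.0,
    },
    "batch_size": 1,          # a small batch keeps peak memory low on 8GB
    "grad_checkpoint": True,  # recompute activations to save memory
}
```

A lower rank and fewer adapted layers shrink both the trainable parameter count and peak memory, which is the flexibility the update is advertising.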
MLX Swift is updated with fused attention (from @argmaxinc) and fast quantized kernels. LLM example here: https://t.co/Qjo1DWwqfI A 4-bit Mistral 7B runs quite fast for thousands of tokens on my M1: https://t.co/zi3uxJxE3C
Levanter has LoRA support! Now you can do lightweight fine-tuning in a fully reproducible way on GPU or TPU. https://t.co/Zf4jcQRRZD
You gotta love what @apple’s mlx team cooked: - A unified memory model that literally does compute-magic: parallel operations with automatic dependency insertions. - Supports off-the-shelf use of all the fun stuff in composable func transformations (differentiation,… https://t.co/fxz6CEoi9H
Apple is going all out with MLX. a few days ago they re-released MLX with Swift so you can run LLMs locally. now they’re onto MLXServer so you can build APIs around them more easily. solid TF/PyTorch competitor in the making. https://t.co/8auXfSYvax
Exciting new project: MLXServer An easy way to get started with LLMs locally. HTTP endpoints for text generation, chat, converting models, and more. Setup: pip install mlxserver Docs: https://t.co/mLCWxUdcec Example: https://t.co/DEQLHOSAZp https://t.co/zSMgfoIGz1
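The docs links above are shortened, so the exact routes aren't visible in the thread; the sketch below shows the general shape of calling a local text-generation endpoint over HTTP. The port, the `/generate` route, and the payload/response fields are assumptions for illustration, not MLXServer's documented API:

```python
import json
from urllib import request

# Hypothetical MLXServer-style client. The port, the /generate route,
# and the payload/response fields are illustrative assumptions, not
# the documented MLXServer API.
BASE_URL = "http://localhost:5000"

def build_payload(prompt: str, max_tokens: int = 128) -> dict:
    """Assemble the JSON body for a text-generation request."""
    return {"prompt": prompt, "max_tokens": max_tokens}

def generate(prompt: str, max_tokens: int = 128) -> str:
    body = json.dumps(build_payload(prompt, max_tokens)).encode()
    req = request.Request(
        f"{BASE_URL}/generate",  # assumed route
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:   # requires a running local server
        return json.load(resp)["text"]   # assumed response field
```

The appeal of this pattern is that the model stays resident in the server process, so repeated requests from scripts or apps don't pay the model-loading cost each time.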
Mustafa (@maxaljadery) and I are excited to announce MLXserver: a Python endpoint for downloading and performing inference with open-source models optimized for Apple Metal ⚙️ Docs: https://t.co/69nBje4BJk https://t.co/vnLtMSJYtL