The TinyLlama project has trained a 1.1B-parameter Llama model on 3 trillion tokens using 16 A100-40G GPUs, a run it aimed to complete within 90 days. The model follows the Llama 2 architecture and is compatible with 🤗 Transformers.js; its small size lets it run fast with low memory and compute requirements. It can now chat and hold a conversation, showing potential for fine-tuning on specific tasks. The latest version of mlx_llm supports TinyLLaMA and Phi2 models, enabling them to run on an 8GB MacBook with Apple Silicon.
Just released a new version of mlx_llm with support for TinyLLaMA and Phi2 models. They can run on an 8GB MacBook with Apple Silicon, so you can have a chat with them without a Pro model. Another step towards local computing! https://t.co/2UPzWFnQsK #mlx #apple #llm
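The tweet doesn't include the mlx_llm calls themselves. As a rough sketch of the same local-inference workflow, Apple's mlx-lm package (a related but distinct library, so the API here is not mlx_llm's) loads a converted TinyLlama in a couple of lines; the mlx-community repo name is an assumption:

```python
# Sketch with Apple's mlx-lm package, not mlx_llm itself -- the APIs differ.
# The mlx-community model repo below is an assumed conversion of TinyLlama chat.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/TinyLlama-1.1B-Chat-v1.0")
response = generate(model, tokenizer, prompt="What is Apple Silicon?", max_tokens=100)
print(response)
```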
TinyLlama is amazing! I have been waiting for a <3B permissive model to come out. Fine-tuning small LLMs to do very specific tasks has so much potential. I loaded it up in my prompt upsampler and it works shockingly well. 🧵 https://t.co/Oj5gKPcACu
tinyllama is a 1.1B parameter model trained on 3T tokens. it now knows how to chat and can hold a conversation. model links and more... https://t.co/EaCkfI4Bqv
Just using MLX to fine-tune TinyLlama with LoRA locally on an 8 GB Mac Mini. Code: https://t.co/BCQZAWHCTA That's the 1.1B-parameter TinyLlama, which just finished training on 3T tokens. Happy new year! Looking forward to more local LLMs in 2024. https://t.co/kACMRZ6Suw
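The linked code is the LoRA fine-tuning example from Apple's mlx-examples; the core idea is to freeze the pretrained weights and train only a low-rank update on top of them. A minimal sketch of such a LoRA layer in MLX (the rank, scale, and initialization below are the common recipe, not necessarily the exact values that script uses):

```python
import math

import mlx.core as mx
import mlx.nn as nn


class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update: y = Wx + scale * (x A) B."""

    def __init__(self, linear: nn.Linear, rank: int = 8, scale: float = 2.0):
        super().__init__()
        self.linear = linear  # base weights, kept frozen during fine-tuning
        out_dims, in_dims = linear.weight.shape
        self.scale = scale
        # A starts small and random, B starts at zero, so training begins
        # exactly at the base model's behavior.
        bound = 1.0 / math.sqrt(in_dims)
        self.lora_a = mx.random.uniform(low=-bound, high=bound, shape=(in_dims, rank))
        self.lora_b = mx.zeros((rank, out_dims))

    def __call__(self, x):
        y = self.linear(x)
        z = (x @ self.lora_a) @ self.lora_b
        return y + self.scale * z
```

Typically only a handful of attention projections get wrapped this way, which is what keeps the trainable parameter count and optimizer state small enough for 8 GB of unified memory.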
TinyLlama is a 1.1B model with the Llama 2 architecture, trained on 3 trillion tokens. Its small size means it can run fast with low memory and compute requirements. https://t.co/yuyAYhZMMh
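For a back-of-envelope sense of that size (not from the tweet): 1.1B parameters come to about 2.2 GB of weights at fp16 and roughly 0.6 GB at 4-bit quantization, which is why the model fits comfortably on the 8 GB machines mentioned above.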
TinyLlama is finally here: a 1.1B Llama model trained on 3 trillion tokens! It's also compatible with 🤗 Transformers.js (see code below)! What a way to end the year! https://t.co/ILbuqqaIGV https://t.co/KnC23RUUVD
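The Transformers.js snippet that tweet points to isn't reproduced in this digest. As a stand-in, here is the equivalent in the Python 🤗 Transformers library (the model id is the public TinyLlama/TinyLlama-1.1B-Chat-v1.0 checkpoint; the prompt and sampling settings are illustrative):

```python
from transformers import pipeline

# Text-generation pipeline around the 1.1B chat checkpoint.
pipe = pipeline("text-generation", model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")

# Format a single user turn with the model's built-in chat template.
messages = [{"role": "user", "content": "Explain what TinyLlama is in one sentence."}]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

out = pipe(prompt, max_new_tokens=64, do_sample=True, temperature=0.7)
print(out[0]["generated_text"])
```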
The TinyLlama project could be a game-changer: it's currently pretraining a 1.1B Llama model on 3 trillion tokens, and the team aims to achieve this within a span of "just" 90 days using 16 A100-40G GPUs. Training started on 2023-09-01. Now, overall, if a model… https://t.co/0Isdk5SIBZ
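A quick sanity check on that schedule: 3×10¹² tokens over 90 days on 16 GPUs works out to 3e12 / (90 × 86,400 s × 16) ≈ 24,000 tokens per second per A100, an aggressive but plausible throughput for a 1.1B-parameter model.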