OpenAI's GPT-2 (124M) model, released in 2019, has become far more accessible as compute costs have fallen: it can now be replicated in 90 minutes for about $20 using llm.c. At the other end of the scale, the cost to train GPT-5 is estimated at $1.7 billion to $2.5 billion, far more than its predecessors. Additionally, a new SOTA open-source VLM, Llama3-V, has been introduced; it outperforms previous models like LLaVA and offers performance comparable to GPT4-V and Gemini Ultra at a significantly smaller model size.
Our cofounder @Yuchenj_UW managed to train GPT-2 using @karpathy's llm.c framework in just 27 minutes for under $10. Kudos to @karpathy for his contributions to open-source AI. The future of AI is collaborative 🤘 https://t.co/rsKf872qM2
Llama 3-V: Close to matching GPT4-V with a 100x smaller model and 500 dollars https://t.co/9o7uneOphM
Fantastic work by the @llm360 team: a fully transparent open large language model that sits between LLaMA 2 70B and LLaMA 3 but with far fewer training FLOPs. Intermediate checkpoints, code, and datasets all included 👍 https://t.co/doCvtX2cQH
Another day has passed, and I managed to train GPT-2 (124M) using @karpathy's llm.c in just 27 minutes with 8 x H100 GPUs for under $10. All you need is to adjust the learning rate (LR). The original maximum learning rate after warmup in the repo was set to 0.0006 (following the… https://t.co/caTeMZseQf https://t.co/7pe0wnhS5d
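The LR tweak above refers to llm.c's default max learning rate of 0.0006 after warmup. As a rough illustration of how such a schedule behaves, here is a minimal sketch of the linear-warmup-plus-cosine-decay shape common to GPT-2 reproductions; the warmup length, total steps, and min-LR fraction below are illustrative assumptions, not the exact llm.c defaults:

```python
import math

def gpt2_lr(step, max_lr=6e-4, warmup_steps=700, max_steps=10000, min_lr_frac=0.1):
    """Linear warmup to max_lr, then cosine decay to min_lr_frac * max_lr.
    The schedule shape follows common GPT-2 reproductions; step counts
    here are placeholders, not the repo's exact settings."""
    min_lr = max_lr * min_lr_frac
    if step < warmup_steps:
        # linear warmup: ramp from near zero up to max_lr
        return max_lr * (step + 1) / warmup_steps
    if step >= max_steps:
        return min_lr
    # cosine decay from max_lr down to min_lr
    ratio = (step - warmup_steps) / (max_steps - warmup_steps)
    coeff = 0.5 * (1.0 + math.cos(math.pi * ratio))
    return min_lr + coeff * (max_lr - min_lr)

peak = gpt2_lr(699)  # last warmup step reaches the 6e-4 maximum
```

Raising that 0.0006 maximum (within stability limits) is what shortens the run: fewer steps are needed to reach the same loss.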
Congratulations to @LLM360 on the release of K2 and pushing the boundaries of open-source LLMs – bravo 👏👏! Our @SnowflakeDB AI Research team is proud to collaborate with communities like @LLM360 to keep advancing AI that is transparent and truly open. https://t.co/L5CviKOsid
Please welcome K2-65B🏔️, the most performant fully-open LLM released to date. As a blueprint for open-source AGI, we release all model checkpoints, code, logs, and data. About K2: 🧠65 billion parameters 🪟Fully transparent & reproducible 🔓Apache 2.0 📈Outperforms Llama 2 70B https://t.co/MBk4R7lq8K
Training GPT-2 in even less time (50 minutes) with 8 H100s for even less money means a roughly 3,000-fold cost reduction in about 5 years. The original GPT-2 was trained (in 2019) over several weeks; admittedly not an apples-to-apples comparison, but think about it: now it takes less than an hour with less… https://t.co/gCgwMpk4Mo
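Taking the 3,000-fold-in-5-years figure from the post at face value, the implied halving time of training cost is a quick calculation:

```python
import math

# If cost falls by `factor` over `years`, model it as cost(t) = cost0 * 0.5**(t / h)
# and solve 0.5**(years / h) = 1 / factor for the halving time h.
factor, years = 3000, 5
halving_time = years * math.log(2) / math.log(factor)  # roughly 0.43 years
```

In other words, under this extrapolation the cost of this particular training run halved about every five months.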
LLama3 8B Vision - an open-source vision model that is very close to GPT4V & GPT4o😯 A couple of architecture changes: SigLIP instead of CLIP, which significantly outperforms it, plus training a really good projection layer from SigLIP to the Llama3 embedding space. Then just strong data curation and… https://t.co/T1ThghIDXT
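To make the "projection layer" idea concrete, here is a toy sketch of mapping vision-encoder patch embeddings into an LLM's token-embedding space. The dimensions (SigLIP ~1152, Llama 3 8B hidden size 4096) are the public ones, but the two-layer MLP shape, ReLU activation, and weight initialization are illustrative assumptions, not the authors' exact design:

```python
import numpy as np

rng = np.random.default_rng(0)
siglip_dim, llama_dim, hidden = 1152, 4096, 4096

# Toy two-layer MLP projector: SigLIP patch embeddings -> Llama 3 token space.
W1 = rng.normal(0, 0.02, (siglip_dim, hidden))
W2 = rng.normal(0, 0.02, (hidden, llama_dim))

def project(patches):
    """patches: (num_patches, siglip_dim) -> (num_patches, llama_dim)."""
    h = np.maximum(patches @ W1, 0.0)  # real projectors often use GELU; ReLU keeps this simple
    return h @ W2

# A 14x14 patch grid yields 196 "image tokens" the LLM can attend to.
img_tokens = project(rng.normal(size=(196, siglip_dim)))
```

The projector's output rows are then prepended to the text token embeddings, which is why only this small module (plus curated data) needs training rather than the full 8B LLM.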
Llama 3-V: Matching GPT4-V with a 100x smaller model and 500 dollars, by Aksh Garg. Llama 3-V is the first ever multimodal model built on Llama 3! https://t.co/SD2o4oDzvH https://t.co/dQLhwWJQGt
Some talented Stanford students trained a vision language model (Llama 3 8B + Siglip) that performs on par with GPT-4 Vision and Opus. For $500! Open Source blogpost + weights 👇 (note, always take benchmarks with a grain of salt) https://t.co/66k2iiCDlM
I trained GPT-2 (124M) using @karpathy's llm.c in just 43 minutes with 8 x H100 GPUs. This is 2.1x faster than the 90 minutes it took with 8 x A100 GPUs. Currently, the cost of renting an H100 GPU is around $2.50/hr (under 1-year commitment), which reduces the training cost for… https://t.co/NOK7poiozk https://t.co/DASUg5czxj
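The cost figure implied by the numbers in this post is simple to check: 8 GPUs at the quoted $2.50/hr rate for 43 minutes comes to about $14:

```python
# Figures from the post: 8 H100s, $2.50 per GPU-hour, 43-minute run.
gpus, price_per_gpu_hr, minutes = 8, 2.50, 43
cost = gpus * price_per_gpu_hr * (minutes / 60)  # about $14.33 total
```

That lines up with the "under $10" claims elsewhere in this digest only at cheaper spot rates, which is why the quoted prices vary between posts.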
OpenAI's expected cost to train the newest GPT-5 model could range from $1.7 to $2.5 billion. That would be ~17x more expensive than GPT-4 and almost 400x more expensive than GPT-3 👀 https://t.co/XqmpQNunyO
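Working backwards from the multiples in the post (these ratios are the post's claims, not independently verified figures), the implied predecessor costs are:

```python
# Low end of the post's GPT-5 estimate and its claimed cost multiples.
gpt5_low = 1.7e9
gpt4_ratio, gpt3_ratio = 17, 400

gpt4_cost = gpt5_low / gpt4_ratio  # implied GPT-4 cost: $100M
gpt3_cost = gpt5_low / gpt3_ratio  # implied GPT-3 cost: $4.25M
```

So the ~17x and ~400x figures are internally consistent with the commonly cited ~$100M GPT-4 and single-digit-millions GPT-3 training costs.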
SOTA open-source VLM that performs as well as GPT4V and Claude Opus using Llama3 8B! https://t.co/QnD2h3c1GH
Introducing Llama3-V, a SOTA open-source VLM. We feature: • Outperforms LLaVA • Comparable performance to GPT4-V, Gemini Ultra, and Claude Opus with a 100x smaller model • SOTA open-source VLM for Llama3 8B Check us out on: • 🤗: https://t.co/ur920NHIz9 • Github:… https://t.co/gTLfEG5BlS
How to replicate GPT-2 in 90 minutes for $20! https://t.co/JXLSUWjSH6
When GPT-2 came out in 2019, it was a frontier language model. People within OpenAI and elsewhere talked about the potential for danger and misuse. 124 million parameters seemed massive. Today, the cost of compute has come down so much that it can be trained on a single GPU for… https://t.co/cMgCvw1lVa
# Reproduce GPT-2 (124M) in llm.c in 90 minutes for $20 ✨ The GPT-2 (124M) is the smallest model in the GPT-2 series released by OpenAI in 2019, and is actually quite accessible today, even for the GPU poor. For example, with llm.c you can now reproduce this model on one 8X… https://t.co/C9GdaxGPhd
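The "$20 in 90 minutes" headline implies a per-GPU rental rate that is easy to back out, assuming an 8-GPU node as described:

```python
# Figures from the post: $20 total, 8 GPUs, 90 minutes.
total_cost, gpus, hours = 20.0, 8, 1.5

gpu_hours = gpus * hours       # 12 GPU-hours consumed
rate = total_cost / gpu_hours  # about $1.67 per GPU-hour
```

~$1.67/GPU-hour matches typical on-demand A100 pricing at budget cloud providers, which is what makes the $20 figure plausible for the GPU poor.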
MiniCPM-Llama3-V 2.5 is now available on the Clarifai Platform! 🎉 MiniCPM-Llama3-V 2.5 is a high-performance, efficient 8B parameter multimodal model excelling in OCR, multilingual support, and multimodal tasks. Here are some key capabilities of the model: • Leading… https://t.co/rj8N8X5x7G