OpenAI's GPT-2 (124M) model, released in 2019, has become far more accessible as compute costs have fallen: it can now be replicated in 90 minutes for about $20 using llm.c. At the other end of the scale, the cost to train GPT-5 is estimated at $1.7 billion to $2.5 billion, far more than its predecessors. Additionally, a new SOTA open-source VLM, Llama3-V, has been introduced; it outperforms previous models like LLaVA and offers performance comparable to GPT4-V and Gemini Ultra at a significantly smaller model size.
Our cofounder @Yuchenj_UW managed to train GPT-2 using @karpathy's llm.c framework in just 27 minutes for under $10. Kudos to @karpathy for his contributions to open-source AI. The future of AI is collaborative 🤘 https://t.co/rsKf872qM2
Llama 3-V: Close to matching GPT4-V with a 100x smaller model and 500 dollars https://t.co/9o7uneOphM
Fantastic work by the @llm360 team: a fully transparent open large language model that sits between LLaMA 2 70B and LLaMA 3 but with far fewer training FLOPs. Intermediate checkpoints, code, and datasets all included 👍 https://t.co/doCvtX2cQH
Another day has passed, and I managed to train GPT-2 (124M) using @karpathy's llm.c in just 27 minutes with 8 x H100 GPUs for under $10. All you need is to adjust the learning rate (LR). The original maximum learning rate after warmup in the repo was set to 0.0006 (following the… https://t.co/caTeMZseQf https://t.co/7pe0wnhS5d
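The LR tweak above refers to llm.c's default max learning rate of 0.0006 after warmup. As a rough illustration of how such a schedule behaves, here is a minimal sketch of the linear-warmup-plus-cosine-decay shape common to GPT-2 reproductions; the warmup length, total steps, and min-LR fraction below are illustrative assumptions, not the exact llm.c defaults:

```python
import math

def gpt2_lr(step, max_lr=6e-4, warmup_steps=700, max_steps=10000, min_lr_frac=0.1):
    """Linear warmup to max_lr, then cosine decay to min_lr_frac * max_lr.
    The schedule shape follows common GPT-2 reproductions; step counts
    here are placeholders, not the repo's exact settings."""
    min_lr = max_lr * min_lr_frac
    if step < warmup_steps:
        # linear warmup: ramp from near zero up to max_lr
        return max_lr * (step + 1) / warmup_steps
    if step >= max_steps:
        return min_lr
    # cosine decay from max_lr down to min_lr
    ratio = (step - warmup_steps) / (max_steps - warmup_steps)
    coeff = 0.5 * (1.0 + math.cos(math.pi * ratio))
    return min_lr + coeff * (max_lr - min_lr)

peak = gpt2_lr(699)  # last warmup step reaches the 6e-4 maximum
```

Raising that 0.0006 maximum (within stability limits) is what shortens the run: fewer steps are needed to reach the same loss.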
Congratulations to @LLM360 on the release of K2 and pushing the boundaries of open-source LLMs – bravo 👏👏! Our @SnowflakeDB AI Research team is proud to collaborate with communities like @LLM360 to keep advancing AI that is transparent and truly open. https://t.co/L5CviKOsid
Please welcome K2-65B🏔️, the most performant fully-open LLM released to date. As a blueprint for open-source AGI, we release all model checkpoints, code, logs, and data. About K2: 🧠65 billion parameters 🪟Fully transparent & reproducible 🔓Apache 2.0 📈Outperforms Llama 2 70B https://t.co/MBk4R7lq8K
Training GPT-2 in even less time (50 minutes) with 8 H100s for even less money means a roughly 3,000-fold cost reduction in about 5 years. The original GPT-2 was trained (in 2019) over several weeks; admittedly not an apples-to-apples comparison, but think about it: now it takes less than an hour with less… https://t.co/gCgwMpk4Mo
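Taking the 3,000-fold-in-5-years figure from the post at face value, the implied halving time of training cost is a quick calculation:

```python
import math

# If cost falls by `factor` over `years`, model it as cost(t) = cost0 * 0.5**(t / h)
# and solve 0.5**(years / h) = 1 / factor for the halving time h.
factor, years = 3000, 5
halving_time = years * math.log(2) / math.log(factor)  # roughly 0.43 years
```

In other words, under this extrapolation the cost of this particular training run halved about every five months.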
LLama3 8B Vision - an open-source vision model that is very close to GPT4V & GPT4o😯 A couple of architecture changes: SigLIP instead of CLIP, which significantly outperforms it, plus training a really good projection layer from SigLIP to the Llama3 embedding space. Then just strong data curation and… https://t.co/T1ThghIDXT
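To make the "projection layer" idea concrete, here is a toy sketch of mapping vision-encoder patch embeddings into an LLM's token-embedding space. The dimensions (SigLIP ~1152, Llama 3 8B hidden size 4096) are the public ones, but the two-layer MLP shape, ReLU activation, and weight initialization are illustrative assumptions, not the authors' exact design:

```python
import numpy as np

rng = np.random.default_rng(0)
siglip_dim, llama_dim, hidden = 1152, 4096, 4096

# Toy two-layer MLP projector: SigLIP patch embeddings -> Llama 3 token space.
W1 = rng.normal(0, 0.02, (siglip_dim, hidden))
W2 = rng.normal(0, 0.02, (hidden, llama_dim))

def project(patches):
    """patches: (num_patches, siglip_dim) -> (num_patches, llama_dim)."""
    h = np.maximum(patches @ W1, 0.0)  # real projectors often use GELU; ReLU keeps this simple
    return h @ W2

# A 14x14 patch grid yields 196 "image tokens" the LLM can attend to.
img_tokens = project(rng.normal(size=(196, siglip_dim)))
```

The projector's output rows are then prepended to the text token embeddings, which is why only this small module (plus curated data) needs training rather than the full 8B LLM.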
Llama 3-V: Matching GPT4-V with a 100x smaller model and 500 dollars, by Aksh Garg. Llama 3-V is the first ever multimodal model built on Llama 3! https://t.co/SD2o4oDzvH https://t.co/dQLhwWJQGt
Some talented Stanford students trained a vision language model (Llama 3 8B + Siglip) that performs on par with GPT-4 Vision and Opus. For $500! Open Source blogpost + weights 👇 (note, always take benchmarks with a grain of salt) https://t.co/66k2iiCDlM
I trained GPT-2 (124M) using @karpathy's llm.c in just 43 minutes with 8 x H100 GPUs. This is 2.1x faster than the 90 minutes it took with 8 x A100 GPUs. Currently, the cost of renting an H100 GPU is around $2.50/hr (under 1-year commitment), which reduces the training cost for… https://t.co/NOK7poiozk https://t.co/DASUg5czxj
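The cost figure implied by the numbers in this post is simple to check: 8 GPUs at the quoted $2.50/hr rate for 43 minutes comes to about $14:

```python
# Figures from the post: 8 H100s, $2.50 per GPU-hour, 43-minute run.
gpus, price_per_gpu_hr, minutes = 8, 2.50, 43
cost = gpus * price_per_gpu_hr * (minutes / 60)  # about $14.33 total
```

That lines up with the "under $10" claims elsewhere in this digest only at cheaper spot rates, which is why the quoted prices vary between posts.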
OpenAI's expected cost to train the newest GPT-5 model could range from $1.7 to $2.5 billion. That would be ~17x more expensive than GPT-4 and almost 400x more expensive than GPT-3 👀 https://t.co/XqmpQNunyO
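Working backwards from the multiples in the post (these ratios are the post's claims, not independently verified figures), the implied predecessor costs are:

```python
# Low end of the post's GPT-5 estimate and its claimed cost multiples.
gpt5_low = 1.7e9
gpt4_ratio, gpt3_ratio = 17, 400

gpt4_cost = gpt5_low / gpt4_ratio  # implied GPT-4 cost: $100M
gpt3_cost = gpt5_low / gpt3_ratio  # implied GPT-3 cost: $4.25M
```

So the ~17x and ~400x figures are internally consistent with the commonly cited ~$100M GPT-4 and single-digit-millions GPT-3 training costs.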
SOTA open-source VLM that performs as well as GPT4V and Claude Opus using Llama3 8B! https://t.co/QnD2h3c1GH
Introducing Llama3-V, a SOTA open-source VLM. We feature: • Outperforms LLaVA • Comparable performance to GPT4-V, Gemini Ultra, and Claude Opus with a 100x smaller model • SOTA open-source VLM for Llama3 8B Check us out on: • 🤗: https://t.co/ur920NHIz9 • Github:… https://t.co/gTLfEG5BlS
How to replicate GPT-2 in 90 minutes for $20! https://t.co/JXLSUWjSH6
When GPT-2 came out in 2019, it was a frontier language model. People within OpenAI and elsewhere talked about the potential for danger and misuse. 124 million parameters seemed massive. Today, the cost of compute has come down so much that it can be trained on a single GPU for… https://t.co/cMgCvw1lVa
# Reproduce GPT-2 (124M) in llm.c in 90 minutes for $20 ✨ The GPT-2 (124M) is the smallest model in the GPT-2 series released by OpenAI in 2019, and is actually quite accessible today, even for the GPU poor. For example, with llm.c you can now reproduce this model on one 8X… https://t.co/C9GdaxGPhd
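The "$20 in 90 minutes" headline implies a per-GPU rental rate that is easy to back out, assuming an 8-GPU node as described:

```python
# Figures from the post: $20 total, 8 GPUs, 90 minutes.
total_cost, gpus, hours = 20.0, 8, 1.5

gpu_hours = gpus * hours       # 12 GPU-hours consumed
rate = total_cost / gpu_hours  # about $1.67 per GPU-hour
```

~$1.67/GPU-hour matches typical on-demand A100 pricing at budget cloud providers, which is what makes the $20 figure plausible for the GPU poor.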
MiniCPM-Llama3-V 2.5 is now available on the Clarifai Platform! 🎉 MiniCPM-Llama3-V 2.5 is a high-performance, efficient 8B parameter multimodal model excelling in OCR, multilingual support, and multimodal tasks. Here are some key capabilities of the model: • Leading… https://t.co/rj8N8X5x7G