GPT-4, a large language model developed by OpenAI, held the top spot for roughly a year after its launch. Gemini, Gemini 1.5, Mixtral, and Claude have all tried to beat it, and only Claude has managed to edge it out, narrowly. Some observers read this as a sign that the industry may be approaching a plateau in model capability, since even major players like Google and Anthropic have not beaten GPT-4 decisively. Recent side-by-side evaluations on MATH nonetheless suggest that GPT-4's supremacy is over: OpenAI and Anthropic now share the top, with `mistral`-large a close contender.
Side by Side eval of latest publicly available models on MATH using the exact same setup for all models 3 months apart (Dec 2023 vs March 2024). The supremacy of GPT-4 is over. It was about time (end of training of GPT-4, Summer 2022) :) https://t.co/oSF54HA45k https://t.co/IXiN0xHthN
Interesting how the picture changed in only 3 months (this is an eval of latest publicly available models on MATH using the exact same setup for all models) Now there's 2 labs at the top (OpenAI and Anthropic). `mistral`-large is a close contender and above `claude-3-sonnet`. https://t.co/1GMzfjLYvM https://t.co/zY9NWCfdND
To mark the anniversary of GPT-4's launch, let's all remember that systems, not just base models, matter--GPT-4 has gotten more useful over time with better fine-tuning, tool use, UX, etc. AI policy folks risk "winning the last war" by not taking systems seriously enough.
🎉 Happy 1st Birthday, GPT-4! Ten birthday observations: • We might be reaching a plateau in terms of sheer capability. Nobody has been able to beat it decisively. Places like Google and Anthropic have put in a lot of money trying. None succeeded; instead there tentatively…
AI is moving so fast! Also: it’s been a year and we are still basically stuck at GPT-4 level models. 🤔
One year since GPT-4 deployment: From GPT-1 and 2 establishing the language model paradigm, through GPT-3's scaling predictions, to GPT-4 showing how complex systems emerge, mimicking nature’s unpredictable patterns from simple elements. An exploration from observation to deep,…
Gemini, Gemini 1.5, Mixtral and Claude have all tried to beat GPT-4. No one has succeeded except for Claude, which inches it out. I suspect OpenAI themselves can’t beat GPT-4 significantly. It becomes harder and harder to squeeze performance from ML models anyways, so this is…