The ARC-AGI challenge, a benchmark for evaluating artificial general intelligence (AGI) proposed by François Chollet, has seen significant progress. Jack Cole's team (MindsAI) recently set a new high score of 38% on the private Kaggle leaderboard, surpassing the previous record of 34%. The ARC Prize carries a prize pool of more than $1 million, and a solution must reach 85% accuracy on the private leaderboard to beat the challenge. On the separate public evaluation set, Ryan Greenblatt of Redwood Research reported 50% using GPT-4o (and 72% on the train set), a result still being validated, while Jack Cole's team reports that its own solution scored about 60% on the public set back in February 2024. Chollet puts the public-set state of the art at roughly 60% for a heterogeneous ensemble and 54% for a single approach, but stresses that the private leaderboard is what really matters. ARC-AGI remains unbeaten five years after it was proposed as a measure of human-like general fluid intelligence, and the current top results reportedly rely on test-time fine-tuning and on sampling and searching over large numbers of candidate programs.
In just one week, the 2023 ARC Prize high score has officially been broken on the private Kaggle leaderboard. Now at 38%. Congrats @Jcole75Cole and team! (And we are working with @RyanPGreenblatt to validate his high score for the separate public leaderboard) https://t.co/bZSHi1BIHB
New ARC-AGI SOTA solution - 38% @Jcole75Cole and team absolutely crushing it I *believe* this is using active inference (test time fine-tuning) Reminder that prizes require that solutions be open sourced by the end of the competition https://t.co/DXSvoDuTW1 https://t.co/nGZkrdP9Ud
New ARC-AGI high score! 38% (Prize goal: 85%) Congratulations, MindsAI! https://t.co/iFTeEVaGci
New ARC-AGI high score! 38% (Prize goal: 85%) https://t.co/T3oyURQjnq
This was an interesting conversation with François Chollet, an AI researcher at Google: https://t.co/SChyJUK87B. He launched a million dollar prize to solve the ARC benchmark created to measure a human-like form of general fluid intelligence.
We interviewed the current winners of the ARC challenge from @fchollet - Jack Cole @Jcole75Cole Mohamed Osman and Michael Hodel @bayesilicon - and commented on the new 50% result from @RedwoodResearch - just dropped on MLST https://t.co/nuSE6OaHuG
SotA on the public evaluation set of ARC-AGI is ~60% for a heterogenous ensemble, 54% for a single approach. But what really matters is the private leaderboard on Kaggle (33% there so far vs 85% needed to beat the challenge) https://t.co/AcrSVTbnyr
This is very interesting work. But if we are going to play the SoTA game, our solution that first broke SoTA on the private test set (34%) for ARC scored 60% on the public test set back in February, 2024. You wouldn't want us to run it on the training dataset like Ryan did. https://t.co/0bCftZHmT7
inb4 OpenAI drops Q* and gets 100% on ARC https://t.co/PMOCCQ2diJ
Researcher achieves state of the art on ARC AGI, a $1M prize on a logical geometric reasoning benchmark. Prior best was 34%. He got 72% on train set and 50% on the harder public test set. Some say he made 4o brute force 8000 programs but there are a lot more cool hacks.. 1/7 https://t.co/E4VRtyyFZh
ARC-AGI remains unbeaten 5 years after @fchollet proposed it. Discussions have shifted from defining intelligence to AGI doomsday scenarios. To refocus, @fchollet launched the @arcprize with a $1M+ prize pool! We summarize last week's updates here: https://t.co/Jw2mxTCFfY
Just a week ago @fchollet announced the launch of the ARC Prize to work on AGI. It's a $1,000,000+ prize pool competition to beat and open-source a solution to the ARC-AGI eval. What's ARC-AGI eval? We discussed it here: https://t.co/J6Yg1SoOaB https://t.co/5nz4TxAbXz
Getting 50% (SoTA) on ARC-AGI with GPT-4o https://t.co/hstJzFES7C https://t.co/FPxLjNQZmw
Congrats to the winners and everyone who took part in GAIC Maths 2024! Thanks to @AGI_Odyssey for putting on a great global event that showed how AI can solve complex math problems. The @NetMindAI team extends our gratitude and full support for future AGI Odyssey events. We're… https://t.co/hQ9fe6jky3
50% on ARC-AGI with GPT-4o This wonderful blog post brings out another point that I didn't explicitly mention in my blog -- ARC-AGI gets solved with a bunch of very clever tricks around existing models, and more search compute. https://t.co/YvoT4PC3yz https://t.co/CeXqixsbSF
High score on @arcprize is now 33% from Jack Cole's team He holds the SOTA record of 34% Once we get above 34% we are into new territory https://t.co/cX5wUJxsY6
I asked Buck about his thoughts on ARC-AGI to prepare for interviewing @fchollet. He tells his coworker Ryan, and within 6 days they've beat SOTA on ARC and are on the heels of average human performance. 🤯 "On a held-out subset of the train set, where humans get 85% accuracy,… https://t.co/MmlLe2Miuh https://t.co/FemPb1cb19