GPT-4 models have demonstrated performance at or above human levels on certain Theory of Mind (ToM) tasks, such as identifying indirect requests, false beliefs, and misdirection, but have struggled with detecting faux pas. Notably, GPT-4 surpassed adult human performance on 6th-order ToM inferences, suggesting that increased model size, instruction fine-tuning, multimodal capabilities, and word comprehension contribute to its ability to model mental states. Despite these results, benchmarking LLMs' intelligence remains inconsistent: some tests show a significant gap between GPT-4 and human performance, potentially due to differences in testing structure or prompting methods.
🧠 "Thinking at a Distance" in the Age of AI: LLMs, with their vast corpora and speed, redefine the essence of cognition. The extraordinary rise of large language models (LLMs) has exposed a curious split between human and artificial intelligence when it comes to processing… https://t.co/vT1AkHkfyf
Using #ChatGPT in the Development of Clinical Reasoning Cases: A Qualitative Study https://t.co/PrUhrGZJky
LLMs' "intelligence" is hard to benchmark, as we don't have good benchmarks for human performance at complex tasks. Take theory of mind: several tests found GPT-4 beats humans, but another finds a huge gap. Is it the testing structure? Prompting? Which is right? Hard to know. https://t.co/z9L3stRCDP
"GPT-4 exhibits higher-order ToM at the level of adult humans. The best-performing LLMs have a capacity for ToM. Given the role that ToM plays in a wide range of behaviours, significant implications" Winnie Street & Google team https://t.co/w2MbVJCZp8 #AIIntelligence
⁉️ Let's check how GPT-4o, Gemini, Llama3, Mixtral, and Claude perform on theory of mind, shall we? 🌟 We report new results on the FANToM benchmark 👻 - GPT-4o tops the chart by finally achieving a score of 2.0/100 (vs. 87.5 for humans) - Huge boost for Gemini-1.5-flash compared to… https://t.co/rtuNsfEyIF https://t.co/GZWidSdIYs
GPT-4 just surpassed adult human performance at THEORY OF MIND tasks (basically, mind reading) And this is just GPT-4! Imagine GPT-5... and, if we're still here to see it, GPT-6. Soon, AIs will think we're as slow as plants. PLANTS. And… um… most people don't know this, but… https://t.co/zE0GOrCtsD https://t.co/QDkQ3ETBGt
GPT-4 demonstrated superior performance over humans in 6th Order ToM inferences, suggesting that increased model size, instruction fine-tuning, multimodal capabilities, and word comprehension - or interplay among all of these - contribute to its ability to model mental states. https://t.co/k1B0veyiog https://t.co/SR8JOzFBDo
Testing theory of mind in large language models and humans GPT-4 models performed at, or even sometimes above, human levels at identifying indirect requests, false beliefs and misdirection, but struggled with detecting faux pas. https://t.co/djwvSQ2XoX
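To make concrete what these studies are measuring, here is a minimal sketch of a Sally-Anne-style false-belief item, one of the task types the feed above mentions. All names (`FALSE_BELIEF_ITEM`, `score_answer`, `evaluate`, `no_tom_model`) and the keyword-matching scorer are hypothetical illustrations, not the actual protocol of any cited benchmark; real benchmarks such as FANToM use far more elaborate items and scoring.

```python
# Hypothetical false-belief test item in the Sally-Anne style.
# A model that tracks Sally's (false) belief should answer "basket";
# a model that only tracks the world state will answer "box".
FALSE_BELIEF_ITEM = {
    "story": (
        "Sally puts her marble in the basket and leaves the room. "
        "While she is away, Anne moves the marble to the box. "
        "Sally comes back."
    ),
    "question": "Where will Sally look for her marble first?",
    "correct": "basket",
}


def score_answer(answer: str, correct: str) -> int:
    """Score 1 if the target location appears in the normalised answer."""
    return int(correct in answer.strip().lower())


def evaluate(model_answer, items) -> float:
    """Fraction of items answered correctly by a model callable."""
    return sum(score_answer(model_answer(i), i["correct"]) for i in items) / len(items)


# Stand-in "model" with no theory of mind: it reports the marble's
# true location, ignoring Sally's outdated belief.
no_tom_model = lambda item: "She will look in the box."

print(evaluate(no_tom_model, [FALSE_BELIEF_ITEM]))  # → 0.0
```

Swapping the stub for a real LLM call (and many varied items) is what turns this into a benchmark; as the tweets above note, seemingly small changes to item structure and prompting can swing the resulting scores dramatically.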