Search
News Chat
Login
Search
Top
For You
Business
cbdcs
Crypto
Culture
Environment
Politics
Science
Sports
Tech
Video Games
World
Sonnet News
Top stories
Claude 3.5 Achieves 40% on SWE-Bench Lite, Outperforms GPT and Gemini Pro
Authors
5
2 days
AI
Software
Tech
Prediction markets for Sonnet
Prediction markets for Sonnet
Where will Anthropic's Claude 3.5 Sonnet model rank on LMSys Chatbot Arena on July 7th? Ahead of GPT, Gemini?
Jun 21, 4:48 AM
Jul 8, 3:59 AM
108
48538
Is Claude 3.5 Sonnet a distilled or quantized version of a larger model?
Jun 20, 2:30 PM
Jan 1, 4:59 AM
50.59%
chance
3
11
Option
Votes
NO
YES
101
99
What will be the score of Claude 3.5 sonnet in the LMSYS Chatbot Arena at the end of July 2024?
Jun 21, 11:31 PM
Aug 1, 9:59 PM
28
4824
Will Claude 3.5 Sonnet take the #1 position for any period of time in its first month on the LMSYS leaderboard?
Jun 20, 2:53 PM
Jan 1, 4:59 AM
27.06%
chance
13
1180
Option
Votes
YES
NO
164
61
Will at least 8 of Stephen Casper's 10 accomplishments for SAEs happen by 5/26/25?
May 26, 6:43 PM
May 28, 3:59 AM
16.35%
chance
10
1611
Option
Votes
YES
NO
2262
442
Will Zvi use Claude 3.5 for the majority of his LLM chats each month through October 2024?
Jun 24, 11:18 AM
Nov 2, 3:59 AM
63.33%
chance
24
3156
Option
Votes
NO
YES
1314
761
Will the product of the version numbers of all major AI language models released in 2024 be greater than 1000?
Mar 10, 8:42 PM
Jan 1, 7:59 AM
88.21%
chance
29
4068
Option
Votes
NO
YES
2401
242
Which of Sabine Hossenfelder's predictions from "What will they think about us in 2085?" will be right?
Jun 30, 2:34 AM
Dec 31, 1:29 PM
10
1346
Does Claude 3.5 have control vector(s) to increase its capabilities?
Jun 22, 5:36 PM
Jan 1, 4:59 AM
35.05%
chance
16
1300
Option
Votes
YES
NO
136
73
In which quarter will Claude 4 be released?
Jun 24, 12:56 PM
Apr 2, 3:59 AM
8
912
When will Claude 3.5 Opus be released?
Jun 24, 12:33 PM
Dec 2, 4:59 AM
23
2087
Articles
Latest stories
Claude 3.5 Achieves 40% on SWE-Bench Lite, Outperforms GPT and Gemini Pro
Authors
5
2 days
AI
Software
Tech
AnthropicAI Advances in AI Interpretability with New 'Brain Scan' Technique, Showcases Theory of Mind in LLMs
Authors
8
1 month
AI
Tech
Groq Inc.'s Llama 3 AI Hits 290 T/s for 70B Model, 876 T/s for 8B
Authors
19
2 months
AI
Tech
Amazon Bedrock Adds Anthropic's Claude 3 Opus AI Model with Exceptional Capabilities, Sonnet, and Haiku
Authors
7
2 months
AI
Tech
Amazon Bedrock Introduces Anthropic's Claude 3 Opus AI Model
Authors
13
3 months
AI
Tech
AnthropicAI Unveils Claude 3 AI Models: Haiku (smallest, fastest), Opus (balanced), Sonnet (largest, most capable)
Authors
5
3 months
AI
Tech
Claude AI Models Outperform OpenAI's GPT-4 in Various Evaluations, Claude 3 Opus Leads Human Evaluation Leaderboard
Authors
24
3 months
AI
Tech
Cursor Launches Claude Opus and Sonnet Support, Copilot++ for $20/Month
Authors
4
3 months
AI
Tech
AI Models GPT-4, Claude Opus, Sonnet, and Gemini Pro Pass Mirror Test of Self-Awareness; Copilot Fails
Authors
8
3 months
AI
Tech
AnthropicAI Launches Claude 3 Models on Google Cloud's Vertex AI and Poe for AI Chatbots
Authors
10
3 months
AI
Tech
Anthropic's Claude 3, Backed by Google and Amazon, Rivals GPT-4 with Near-Human Abilities and Self-Awareness
Authors
69
4 months
AI
Tech
Claude 3 Surpasses GPT-4 with 1M Context Window, 50% Cheaper; OpenAI Launches 'Read Aloud'
Authors
48
4 months
AI
Tech
Anthropic's Claude 3 Beats GPT-4 by 0.4%, Supports 200K Tokens, Opus 2.5x Costlier
Authors
123
4 months
AI
Tech
Previous
Next