Anthropic Researchers Use 'Brain Scan' to Unveil Insights into Claude 3's Inner Workings

AI is often described as a black box - nobody truly understands how it works. But a new "brain scan" developed by researchers at Anthropic could be a solution to that problem: https://t.co/qbZXg7NeJZ

Gary Marcus@GaryMarcus

1 mo

Hot take on a fascinating new paper on (partial) interpretability from @AnthropicAI: • The team was able to find (some) concept-like* “feature” representations for concepts ranging from the concrete to more abstract, from Golden Gate Bridge, to Secrecy, and Conflict of… https://t.co/I4NwxXcP5V

Kevin Roose@kevinroose

1 mo

Here's some actual good news in AI! Researchers at Anthropic have made progress toward figuring out what goes on inside LLMs, identifying millions of "features" in Claude 3 that activate when specific concepts such as San Francisco, lithium, or deception are discussed. This…

Alex Albert@alexalbert__

1 mo

Our new interpretability paper offers the first ever detailed look inside a frontier LLM and has amazing stories. I want to share two of them that have stuck with me ever since I read it. For background, the paper shows our latest work on interpreting the “features” of Claude 3… https://t.co/ZQcnpmB3HX

WIRED Business@WIREDBusiness

1 mo

What goes on in artificial neural networks is largely a mystery, even to their creators. But researchers from Anthropic have caught a glimpse. https://t.co/KREv9IR266
