On June 6, 2024, OpenAI introduced a technique for decomposing GPT-4 into 16 million interpretable features. The advance comes from improved methods for training sparse autoencoders at scale, which disentangle GPT-4’s internal representations into features that often appear to correspond to understandable concepts. This marks significant progress toward understanding the neural activity of language models; the new methods scale better than existing work and are completely unsupervised.
This is really superb work. If you liked the Sonnet/Golden Gate stuff, you'll like this too. They're open-sourcing their GPT-2 SAEs as well 😍 https://t.co/8Hg1guFg11
This is super cool work! Sparse autoencoders are currently the most promising approach to actually understanding how models "think" internally. This new paper demonstrates how to scale them to GPT-4 and beyond – completely unsupervised. A big step forward! https://t.co/jZ36peImDr
OpenAI's GPT-4 Surpasses Human Performance in Theory of Mind, Identifies 16 Million Features https://t.co/IIkWTEqNvc
https://t.co/Mhzh95J1la “Today, we are sharing improved methods for finding a large number of "features"—patterns of activity that we hope are human interpretable. Our methods scale better than existing work, and we use them to find 16 million features in GPT-4”
.@OpenAI just dropped a new technique to break GPT-4 down into 16,000,000 #interpretable features 🧵 https://t.co/vhKbUU5GzZ
We're sharing progress toward understanding the neural activity of language models. We improved methods for training sparse autoencoders at scale, disentangling GPT-4’s internal representations into 16 million features—which often appear to correspond to understandable concepts… https://t.co/UFP0EfEKSL
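The sparse-autoencoder idea these posts describe can be sketched in a few lines: encode a model activation into a much wider, mostly-zero feature vector, then decode it back into a reconstruction. Below is a minimal, illustrative sketch of a TopK-style sparse autoencoder forward pass, in the spirit of what OpenAI describes. All sizes, weights, and names here are toy assumptions for illustration, not taken from the paper or its code release.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model = 8        # width of a model activation vector (toy size)
n_features = 32    # dictionary size; the real work scales this to 16 million
k = 4              # number of features allowed to be active per input

# Randomly initialized encoder/decoder weights (a trained SAE would learn these
# by minimizing reconstruction error on real model activations).
W_enc = rng.normal(size=(d_model, n_features)) / np.sqrt(d_model)
W_dec = rng.normal(size=(n_features, d_model)) / np.sqrt(n_features)
b_enc = np.zeros(n_features)
b_dec = np.zeros(d_model)

def topk_sae(x):
    """Encode an activation into a k-sparse feature vector, then decode it."""
    pre = (x - b_dec) @ W_enc + b_enc      # encoder pre-activations
    acts = np.maximum(pre, 0.0)            # ReLU
    # TopK sparsity: keep only the k largest activations, zero the rest.
    acts[np.argsort(acts)[:-k]] = 0.0
    recon = acts @ W_dec + b_dec           # decoder reconstruction
    return acts, recon

x = rng.normal(size=d_model)
acts, recon = topk_sae(x)
print("active features:", int((acts > 0).sum()))  # at most k
```

The key design point is that sparsity is enforced structurally (TopK) rather than via an L1 penalty, so each input is explained by only a handful of dictionary features; interpretability then comes from inspecting which inputs activate each feature.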