Anthropic AI's Interpretability Release Introduces 'Go

The golden gate bridge is an extrovert and prefers when it's full of cars over it 🚗🚗🚗 https://t.co/4Bd24IEi22 https://t.co/IK2sjVKCbm

AshutoshShrivastava @ai_for_success

1 mo

For a limited time, you can chat with Golden Gate Claude 😉 If you dunno what is Golden gate, more details in 🧵 1/n Click on golden gate icon on top right. https://t.co/cuEfheovur

Jack Clark @jackclarkSF

1 mo

One of the most amazing parts of the recent Interpretability release has been how we can use 'feature clamping' to change how models behave. For an example, play around with 'Golden Gate Claude' - check out how it responds to my question about what to build to solve AI policy https://t.co/gcRneTTgTs https://t.co/oCR18hhYRS

Jonathan Whitaker @johnowhitaker

1 mo

They actually did it! 😂😀 https://t.co/wDMFRgGuHd

TDM (e/λ) @cto_junior

1 mo

lmfaoooo https://t.co/6kvos3W8br https://t.co/ATFYHJgN7d

Tanishq Mathew Abraham, Ph.D. @iScienceLuvr

1 mo

You can talk to Golden Gate Bridge Claude now! Available at https://t.co/KbsmiPpe3G https://t.co/5NW9rbAs3m

kipply @kipperrii

1 mo

deep down inside, golden gate claude realises that he's been hacked to think about the golden gate bridge! I never told him it was a landmark. I think he likes it? https://t.co/MAKc6JtUBt

sam mcallister @sammcallister

1 mo

you have got to sweat the details https://t.co/eq0ey4VDo1 https://t.co/fJzJBvXogB

kipply @kipperrii

1 mo

golden gate claude lacks self-awareness to realise that his talking about the golden gate bridge is really weird https://t.co/puemSuoIvu

Spencer Schiff @SpencerKSchiff

1 mo

Holy shit they actually did it I love Anthropic now https://t.co/X21vrx0I4O

kipply @kipperrii

1 mo

golden gate claude is not self-aware! https://t.co/DormkTIhPd

Anthropic @AnthropicAI

1 mo

This week, we showed how altering internal "features" in our AI, Claude, could change its behavior. We found a feature that can make Claude focus intensely on the Golden Gate Bridge. Now, for a limited time, you can chat with Golden Gate Claude: https://t.co/uLbS2JNczH https://t.co/WHmoi2AmoR
