Researchers have released two mechanistic interpretability libraries. Prisma, incubated at @tyrell_turing's lab in collaboration with Neel Nanda, targets multimodal models such as CLIP and ViTs, extending techniques that have so far focused on language models. Pyvene, a separate library, supports performing interventions on the internal states of neural models and makes it easy to add support for new architectures. Together, the announcements below highlight that mechanistic interpretability is expanding beyond language models, with vision-language interpretability work expected to build on both tools.
Pyvene is awesome! Expect to see some cool vision-language interpretability work leveraging it soon 😉 https://t.co/Y4j7ndUaZW
Great work from @soniajoseph_! Frontier models are multimodal, and it's increasingly clear that mechanistic interpretability can't only study language models. Good tooling is unglamorous to work on, but essential for good research. I'm excited to see what work Prisma enables! https://t.co/bhpw1AuDao
Pyvene has been really useful for easily running intervention experiments in my workflow! In particular, it's super easy to add support for new architectures compared to other interp libraries. Come try it out! https://t.co/Y5ac1vJMPB https://t.co/fXvI5Uh84h
New paper and library! 🫡 Intervening on internal states has emerged as a fundamental operation for analyzing and improving neural models. We release pyvene, a library for performing interventions and sharing intervened models. 👉Code & Paper: https://t.co/wV5L9NExft https://t.co/hq8RSfWLwE
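The core operation the pyvene announcement describes — overwriting an internal state mid-forward-pass and observing the effect on the output — can be sketched without the library itself. The toy two-layer network, weights, and `patch_hidden` argument below are all hypothetical illustrations, not pyvene's actual API:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-layer network: hidden = relu(x @ W1); out = hidden @ W2
W1 = rng.normal(size=(4, 8))
W2 = rng.normal(size=(8, 3))

def forward(x, patch_hidden=None):
    """Run the toy model, optionally overwriting the hidden state
    with a cached activation -- the basic 'intervention' operation."""
    hidden = np.maximum(x @ W1, 0.0)
    if patch_hidden is not None:
        hidden = patch_hidden  # intervene: swap in another run's state
    return hidden, hidden @ W2

x_clean = rng.normal(size=4)
x_corrupt = rng.normal(size=4)

clean_hidden, clean_out = forward(x_clean)
_, corrupt_out = forward(x_corrupt)

# Patch the clean run's hidden state into the corrupted run:
_, patched_out = forward(x_corrupt, patch_hidden=clean_hidden)

# Because the output depends only on the hidden state, fully replacing
# it makes the patched output match the clean run exactly.
assert np.allclose(patched_out, clean_out)
```

Libraries like pyvene generalize this idea: they locate the right internal site in a real transformer and handle caching and swapping activations across runs, so experiments like this scale beyond toy models.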
I'm excited to release Prisma, a mechanistic interpretability library for multimodal models like CLIP and ViTs. Incubated at @tyrell_turing's lab & in collab with @NeelNanda5. Recent mech interp work has focused on language, but many techniques transfer. Behold, the dogit lens: https://t.co/gs2wCFIGAa
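The "dogit lens" mentioned above is a logit-lens-style analysis applied to vision models: project intermediate activations through the model's output projection to see what it "believes" at each depth. As a hedged, self-contained sketch with toy random weights (not Prisma's API; in a CLIP/ViT setting the projection would map into the shared image-text embedding space):

```python
import numpy as np

rng = np.random.default_rng(1)
d_model, n_classes, n_layers = 16, 5, 3

# Hypothetical unembedding / output projection matrix.
W_U = rng.normal(size=(d_model, n_classes))

# Stand-in residual-stream states after each layer for one patch token.
residuals = [rng.normal(size=d_model) for _ in range(n_layers)]

def logit_lens(residual, W_U):
    """Project an intermediate residual state through the output
    projection to read off the model's current best guess."""
    return residual @ W_U

# One predicted class per layer, showing how the prediction evolves
# with depth -- the core of a logit-lens analysis.
per_layer_preds = [int(np.argmax(logit_lens(r, W_U))) for r in residuals]
print(per_layer_preds)
```

With random weights the per-layer predictions are meaningless; the point is the mechanic of reading out intermediate layers, which transfers from language models to ViTs largely unchanged.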