Thanks for the comment! We always use the pre-ReLU feature activation, which is equal to the post-ReLU activation (given that the feature is active), and is a purely linear function of z. Edited the post for clarity.
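
To make this concrete, here is a minimal NumPy sketch. It assumes the usual SAE encoder form, where a single feature's pre-ReLU activation is a dot product with an encoder direction plus a bias; the names `w_enc`, `b_enc`, and the dimensions are hypothetical and not taken from the post. With a nonzero bias the map is strictly affine rather than linear, but the point is the same: there is no nonlinearity in z before the ReLU.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in = 8

# Hypothetical encoder parameters for one SAE feature (illustrative only).
w_enc = rng.normal(size=d_in)
b_enc = 0.1


def pre_relu(z):
    """Pre-ReLU feature activation: a dot product plus a bias, no nonlinearity."""
    return z @ w_enc + b_enc


def post_relu(z):
    """Post-ReLU feature activation: the pre-ReLU value clipped at zero."""
    return np.maximum(pre_relu(z), 0.0)


z = rng.normal(size=(1000, d_in))
pre = pre_relu(z)
post = post_relu(z)

# Inputs on which the feature is active (pre-ReLU value is positive).
active = pre > 0

# On active inputs the two activations coincide.
print(np.allclose(pre[active], post[active]))
```

On inputs where the feature is inactive, the post-ReLU value is zero while the pre-ReLU value is negative, so the two only agree on the active set.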

SAE-VIS: Announcement Post

Connor Kissane2mo70

Amazing! We found your original library super useful for our Attention SAEs research, so thanks for making this!

Mech Interp Puzzle 1: Suspiciously Similar Embeddings in GPT-Neo

Connor Kissane9mo10

These puzzles are great, thanks for making them!

Causal scrubbing: results on induction heads

Connor Kissane10mo10

"Code for this token filtering can be found in the appendix and the exact token list is linked."

Maybe I just missed it, but I'm not seeing this. Is the code still available?
