Paper reproductions, interpretability experiments, and interactive explorations of ML research.
This blog will feature reproductions of significant papers in mechanistic interpretability and related areas, along with my own small extensions and experiments. Each post will include an embedded interactive playground where you can run the models and experiments yourself, directly in the browser.