# Mathematical Models of Computation in Superposition

Kaarel Hänni, Jake Mendel, Dmitry Vaintrob, Lawrence Chan

August, 2024

### Abstract

Superposition – when a neural network represents more “features” than it has dimensions – seems to pose a serious challenge to mechanistically interpreting current AI systems. Existing theory work studies *representational* superposition, where superposition is only used when passing information through bottlenecks. In this work, we present mathematical models of *computation* in superposition, where superposition is actively helpful for efficiently accomplishing the task. We first construct a task of efficiently emulating a circuit that takes the AND of each of the Õ(m^2) pairs of m features. We construct a 1-layer MLP that uses superposition to perform this task up to ε-error, where the network only requires Õ(m^(2/3)) neurons, even when the input features are *themselves in superposition*. We generalize this construction to arbitrary sparse boolean circuits of low depth, and then construct “error correction” layers that allow deep fully-connected networks of width d to emulate circuits of width Õ(d^1.5) and *any* polynomial depth. We conclude by providing some potential applications of our work for interpreting neural networks that implement computation in superposition.
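As an illustrative toy (not the paper's construction), the pairwise-AND task and a superposed input representation can be sketched in NumPy. The random ±1 feature directions, the specific dimensions, and the inner-product readout below are all assumptions chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

m = 256  # number of boolean features
d = 64   # embedding dimension; d << m, so features are in superposition
k = 3    # sparsity: number of simultaneously active features

# Hypothetical random-sign feature directions (rows are near-orthogonal).
E = rng.choice([-1.0, 1.0], size=(m, d)) / np.sqrt(d)

# A sparse boolean input with k of m features active.
x = np.zeros(m)
x[rng.choice(m, size=k, replace=False)] = 1.0

# Superposed d-dimensional representation of the m-dimensional input.
h = x @ E  # shape (d,)

# Approximate readout of feature i via inner product with its direction:
# roughly x plus O(sqrt(k/d)) interference noise from the other features.
readout = E @ h  # shape (m,)

# The target circuit output: the AND of each of the ~m^2/2 feature pairs.
target = np.outer(x, x)[np.triu_indices(m, k=1)]
```

The point of the sketch is the gap the paper addresses: the superposed input `h` lives in only d dimensions, yet the target has Θ(m^2) pairwise ANDs, of which only k(k−1)/2 are ever simultaneously true under sparsity.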

### Publication

Mech Interp Workshop @ ICML 2024
