Blocked Matrix Formulation of Linear Attention Mechanisms

The blocked matrix formulation of linear attention mechanisms, multi-step online gradient descent at inference time, and chunk-wise parallelism.

March 16, 2025 · Franz Louis Cesista

(Linear) Attention as Test-Time Regression

A unifying framework for linear attention mechanisms as test-time regression and how to parallelize training and inference.

January 27, 2025 · Franz Louis Cesista