Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Gu, Dao · 2023 · DOI 10.48550/arXiv.2312.00752
SUMMARY
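A linear-time alternative to Transformers based on input-selective state-space models. The paper reports that Mamba-3B outperforms Transformers of the same size and matches Transformers twice its size, so the evidence covers modest scale only.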
TRANSCRIPT & NOTES
Core innovation
Make the state-space parameters (the step size Δ and the projections B and C) functions of the input, computed with a selective scan. The model can then keep or discard context based on content, without quadratic attention.
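To make the idea concrete, here is a minimal NumPy sketch of a selective scan as a plain sequential loop. It is a reference-style illustration, not the repository's kernel: the function name `selective_scan`, the weight names `W_delta`, `W_B`, `W_C`, and the full D×D projection for Δ are assumptions chosen for readability, and the input matrix uses the simplified discretization Δ·B rather than the exact zero-order hold.

```python
import numpy as np

def softplus(x):
    # Numerically stable log(1 + exp(x)); keeps the step size positive.
    return np.logaddexp(0.0, x)

def selective_scan(x, A, W_delta, W_B, W_C):
    """Sequential selective SSM over one sequence (illustrative only).

    x:       (L, D) input sequence
    A:       (D, N) state matrix (diagonal per channel, typically negative)
    W_delta: (D, D) hypothetical projection producing the step size
    W_B:     (D, N) hypothetical projection producing B_t
    W_C:     (D, N) hypothetical projection producing C_t
    """
    L, D = x.shape
    N = A.shape[1]
    h = np.zeros((D, N))              # hidden state, one N-vector per channel
    y = np.empty((L, D))
    for t in range(L):
        # Input-dependent parameters: this is what makes the SSM "selective".
        delta = softplus(x[t] @ W_delta)        # (D,)
        B = x[t] @ W_B                          # (N,)
        C = x[t] @ W_C                          # (N,)
        # Discretize: A_bar = exp(delta * A); B_bar ~ delta * B (assumed simplification).
        A_bar = np.exp(delta[:, None] * A)      # (D, N)
        B_bar = delta[:, None] * B[None, :]     # (D, N)
        # Linear recurrence, then read out through C.
        h = A_bar * h + B_bar * x[t][:, None]
        y[t] = h @ C                            # (D,)
    return y
```

Because Δ, B, and C are recomputed from `x[t]` at every step, a small Δ leaves the state nearly unchanged (the token is ignored) while a large Δ overwrites it, which is the content-based selection described above. The loop costs O(L) time in sequence length.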
Hardware-aware implementation
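A custom selective-scan kernel fuses the discretization, the recurrence, and the output step so that the expanded state stays in fast on-chip SRAM instead of being written to GPU HBM. Intermediate states are recomputed in the backward pass rather than stored, which keeps memory use low.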
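The kernel can parallelize over the sequence because the recurrence h_t = a_t · h_{t-1} + b_t is associative once each step is treated as a pair (a_t, b_t). The sketch below checks that property in NumPy with a simple doubling scan. It is an assumption-laden illustration: the function names are hypothetical, the doubling scheme does O(L log L) work rather than being work-efficient, and none of the SRAM fusion of the actual CUDA kernel is modeled.

```python
import numpy as np

def sequential_scan(a, b):
    # Reference: h_t = a_t * h_{t-1} + b_t with h_{-1} = 0.
    h = np.zeros_like(b[0])
    out = np.empty_like(b)
    for t in range(len(a)):
        h = a[t] * h + b[t]
        out[t] = h
    return out

def doubling_scan(a, b):
    # Combine rule for an earlier segment (a1, b1) followed by a later one (a2, b2):
    #   (a1, b1) o (a2, b2) = (a1 * a2, a2 * b1 + b2)
    # Each pass merges every position with the segment ending `shift` steps earlier,
    # so log2(L) passes are enough; every pass is elementwise and thus parallelizable.
    a = a.copy()
    b = b.copy()
    shift = 1
    while shift < len(a):
        b[shift:] = a[shift:] * b[:-shift] + b[shift:]
        a[shift:] = a[shift:] * a[:-shift]
        shift *= 2
    return b
```

Both functions return the same states, so the loop over time can be replaced by a logarithmic number of parallel passes when the hardware allows it.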
CODE REPOSITORY
https://github.com/state-spaces/mamba