Simple, minimal implementation of Mamba in one file of PyTorch.

Featuring:
- Numerical output equivalent to the official implementation, for both the forward and backward pass (a sketch of such a check follows this list)
- Simplified, readable, annotated code
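
One way to check the equivalence claim is to feed the same input to both models and compare logits and gradients. The sketch below is illustrative: `official_logits` is assumed to come from a separate run of the official state-spaces/mamba implementation, and the tolerances are assumptions rather than values used by this repo.

```python
import torch

from model import Mamba

model = Mamba.from_pretrained('state-spaces/mamba-370m')
input_ids = torch.randint(0, 1000, (1, 16))  # batch of 1, sequence of 16 tokens

logits = model(input_ids)  # forward pass: (batch, seq_len, vocab_size)
# assert torch.allclose(logits, official_logits, rtol=1e-3, atol=1e-3)

logits.sum().backward()  # backward pass: gradients should match the same way
```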
Does NOT include:
- Speed. The official implementation is heavily optimized, and those optimizations are core contributions of the Mamba paper. I kept most of the implementation simple for readability.
- Proper parameter initialization (though this could be added without sacrificing readability; a sketch follows this list)
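
As a rough illustration of how initialization could be added, the sketch below applies a GPT-style scheme via `nn.Module.apply`; this particular scheme is an assumption for illustration, not the initialization used by the official Mamba implementation.

```python
import torch.nn as nn

def init_weights(module):
    # GPT-style scheme, purely illustrative: small-normal weights for
    # linear projections and embeddings, zero biases.
    if isinstance(module, (nn.Linear, nn.Embedding)):
        nn.init.normal_(module.weight, mean=0.0, std=0.02)
    if isinstance(module, nn.Linear) and module.bias is not None:
        nn.init.zeros_(module.bias)

# Apply to a freshly constructed model (not a pretrained one, since this
# would overwrite the learned weights):
# model.apply(init_weights)
```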
See demo.ipynb for examples of prompt completions.
```python
from model import Mamba
from transformers import AutoTokenizer

model = Mamba.from_pretrained('state-spaces/mamba-370m')
tokenizer = AutoTokenizer.from_pretrained('EleutherAI/gpt-neox-20b')

# generate is the completion helper defined in demo.ipynb
generate(model, tokenizer, 'Mamba is the')
```
> Mamba is the world's longest venomous snake with an estimated length of over 150 m. With such a large size and a venomous bite, Mamba kills by stabbing the victim (which is more painful and less effective than a single stab of the bite)
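
Since `generate` lives in demo.ipynb rather than in model.py, here is a minimal sketch of what such a completion helper might look like; the sampling loop below is an illustrative assumption, not the notebook's exact code.

```python
import torch

@torch.no_grad()
def generate(model, tokenizer, prompt, n_tokens=50):
    # Encode the prompt, then repeatedly sample one token from the model's
    # next-token distribution and append it to the running sequence.
    input_ids = tokenizer(prompt, return_tensors='pt').input_ids
    for _ in range(n_tokens):
        logits = model(input_ids)[:, -1]  # logits at the last position
        probs = torch.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        input_ids = torch.cat([input_ids, next_id], dim=1)
    return tokenizer.decode(input_ids[0])
```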
The mamba_minmal.ipynb demo can be run directly on Colab.
The Mamba architecture was introduced in *Mamba: Linear-Time Sequence Modeling with Selective State Spaces* by Albert Gu and Tri Dao.
The official implementation is here: https://github.com/state-spaces/mamba/tree/main