
[NeurIPS 2023] Sparse Modular Activation for Efficient Sequence Modeling


Sparse Modular Activation for Efficient Sequence Modeling

¹University of Illinois at Urbana-Champaign, ²Microsoft Azure Cognitive Services Research. *Work done during an internship at Microsoft and at UIUC.


Introduction

This is the PyTorch implementation of SeqBoat 🚤 proposed in our paper. This repository builds on MEGA and the fairseq package v0.9.0.

Updates

  • [Nov. 26] Added a standalone CIFAR-10 training script of SeqBoat for quickstart!
  • [Nov. 5] Released training scripts for enwik8 and added a standalone implementation of SeqBoat here!
  • [Sep. 21] Our paper is accepted by NeurIPS 2023!
  • [July 18] Released training scripts for LRA and Speech Commands.

Code Overview

  1. The compress and extract operators for Sparse Modular Activation (SMA) are implemented in fairseq/modules/seqboat_utils.py with the functions compress_seq and extract respectively.
  2. SeqBoat layer is implemented in fairseq/modules/seqboat_unit.py.
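To illustrate the idea behind the two SMA operators, here is a minimal, hypothetical sketch (not the repository's actual `compress_seq`/`extract` code): `compress_seq` packs the activated positions of each sequence to the front so an expensive module only needs to process a short prefix, and `extract` scatters the results back to their original positions. The function names mirror the repository, but the signatures and implementation below are illustrative assumptions.

```python
import torch

def compress_seq(x, mask):
    # x: (B, L, D) token states; mask: (B, L) boolean activation decisions.
    # Stable descending sort of the mask moves activated positions to the
    # front of each sequence while preserving their relative order.
    B, L, D = x.shape
    order = torch.sort(mask.int(), dim=1, descending=True, stable=True).indices
    packed = torch.gather(x, 1, order.unsqueeze(-1).expand(B, L, D))
    lengths = mask.sum(dim=1)  # activated tokens per sequence
    return packed, order, lengths

def extract(packed, order):
    # Inverse of compress_seq: scatter each packed token back to the
    # original position recorded in `order`.
    B, L, D = packed.shape
    out = torch.empty_like(packed)
    out.scatter_(1, order.unsqueeze(-1).expand(B, L, D), packed)
    return out
```

In this sketch, a module would run only on `packed[:, :lengths.max()]`; round-tripping through `compress_seq` and `extract` without any processing recovers the input exactly.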

Setup

This repository requires Python 3.8+ and PyTorch 1.11+.

# Install from this repo
pip install -e .

For faster training, install NVIDIA's apex library following the fairseq instructions.

Quickstart

The easiest way to get started is to run the standalone_cifar.py script. This script trains a simple SeqBoat model on CIFAR-10:

python standalone_cifar.py --prenorm

Experiments

We also provide the training and testing scripts for each of the tasks in the experiment directory.

Citation

If you find our work useful, please consider citing:

@inproceedings{ren2023sparse,
  title={Sparse Modular Activation for Efficient Sequence Modeling},
  author={Liliang Ren and Yang Liu and Shuohang Wang and Yichong Xu and Chenguang Zhu and ChengXiang Zhai},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
  year={2023},
  url={https://openreview.net/forum?id=TfbzX6I14i}
}

License

SeqBoat is under MIT license. The license also applies to model checkpoints.

Contact

Liliang Ren ([email protected])
