
[NeurIPS 2023] Sparse Modular Activation for Efficient Sequence Modeling


Sparse Modular Activation for Efficient Sequence Modeling

¹University of Illinois at Urbana-Champaign, ²Microsoft Azure Cognitive Services Research. *Work done during an internship at Microsoft and at UIUC.


Introduction

This is the PyTorch implementation of SeqBoat 🚤 proposed in our paper. This repository builds on MEGA and the fairseq package v0.9.0.

Updates

  • [Nov. 26] Added a standalone CIFAR-10 training script of SeqBoat for quickstart!
  • [Nov. 5] Released training scripts for enwik8 and added a standalone implementation of SeqBoat here!
  • [Sep. 21] Our paper is accepted by NeurIPS 2023!
  • [July 18] Released training scripts for LRA and Speech Commands.

Code Overview

  1. The compress and extract operators for Sparse Modular Activation (SMA) are implemented in fairseq/modules/seqboat_utils.py with the functions compress_seq and extract respectively.
  2. SeqBoat layer is implemented in fairseq/modules/seqboat_unit.py.
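To illustrate the idea behind the two SMA operators, here is a minimal, hypothetical sketch (not the repository's actual `compress_seq`/`extract` code): `compress_seq` packs the activated positions of each sequence to the front so an expensive module only needs to process a short prefix, and `extract` scatters the results back to their original positions. The function names mirror the repository, but the signatures and implementation below are illustrative assumptions.

```python
import torch

def compress_seq(x, mask):
    # x: (B, L, D) token states; mask: (B, L) boolean activation decisions.
    # Stable descending sort of the mask moves activated positions to the
    # front of each sequence while preserving their relative order.
    B, L, D = x.shape
    order = torch.sort(mask.int(), dim=1, descending=True, stable=True).indices
    packed = torch.gather(x, 1, order.unsqueeze(-1).expand(B, L, D))
    lengths = mask.sum(dim=1)  # activated tokens per sequence
    return packed, order, lengths

def extract(packed, order):
    # Inverse of compress_seq: scatter each packed token back to the
    # original position recorded in `order`.
    B, L, D = packed.shape
    out = torch.empty_like(packed)
    out.scatter_(1, order.unsqueeze(-1).expand(B, L, D), packed)
    return out
```

In this sketch, a module would run only on `packed[:, :lengths.max()]`; round-tripping through `compress_seq` and `extract` without any processing recovers the input exactly.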

Setup

This repository requires Python 3.8+ and PyTorch 1.11+.

# Install from this repo
pip install -e .

For faster training, install NVIDIA's apex library following the fairseq instructions.

Quickstart

The easiest way to get started is to run the standalone_cifar.py script. This script trains a simple SeqBoat model on CIFAR-10:

python standalone_cifar.py --prenorm

Experiments

We also provide the training and testing scripts for each of the tasks in the experiment directory.

Citation

If you find our work useful, please consider citing:

@inproceedings{ren2023sparse,
  title={Sparse Modular Activation for Efficient Sequence Modeling},
  author={Liliang Ren and Yang Liu and Shuohang Wang and Yichong Xu and Chenguang Zhu and ChengXiang Zhai},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
  year={2023},
  url={https://openreview.net/forum?id=TfbzX6I14i}
}

License

SeqBoat is under MIT license. The license also applies to model checkpoints.

Contact

Liliang Ren ([email protected])
