Code for "Structured Sparsity Inducing Adaptive Optimizers for Deep Learning" in PyTorch

tristandeleu/pytorch-structured-sparsity

Structured Sparsity Inducing Adaptive Optimizers for Deep Learning

This is the repository for the paper

Tristan Deleu, Yoshua Bengio, Structured Sparsity Inducing Adaptive Optimizers for Deep Learning [ArXiv]

This repository contains:

  • The weighted and unweighted proximal operators for the ℓ1/ℓ2 (group lasso) and group MCP penalties.
  • A modification of AdamW from Hugging Face's transformers library that adds a proximal step, compatible with the structured sparsity inducing penalties in this repository.
  • The definitions of the groups (channel-wise and row-wise) for some deep learning architectures (VGG, ResNet, BERT).
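
As an illustration of the first two items, here is a minimal sketch of the unweighted proximal operator for the ℓ1/ℓ2 penalty (block soft-thresholding), followed by a plain proximal gradient step. It is not the code in this repository: the names `prox_group_l1l2` and `proximal_step` are hypothetical, the sketch uses plain Python lists rather than PyTorch tensors to stay dependency-free, and it leaves out both the Adam statistics and the weighted variant.

```python
import math


def prox_group_l1l2(group, threshold):
    """Block soft-thresholding: prox of threshold * ||group||_2.

    `group` is a list of floats forming one group of weights
    (hypothetical helper, not the repository's API).
    """
    norm = math.sqrt(sum(w * w for w in group))
    if norm <= threshold:
        # The whole group is set to zero at once: structured sparsity.
        return [0.0] * len(group)
    scale = 1.0 - threshold / norm
    return [scale * w for w in group]


def proximal_step(groups, grads, lr, lam):
    """One plain proximal gradient step (assumption: no Adam statistics).

    Takes a gradient step on each group, then applies the prox
    with threshold lr * lam.
    """
    updated = []
    for group, grad in zip(groups, grads):
        stepped = [w - lr * g for w, g in zip(group, grad)]
        updated.append(prox_group_l1l2(stepped, lr * lam))
    return updated


# Two groups: one with a large norm, one with a small norm.
groups = [[3.0, 4.0], [0.1, -0.1]]
grads = [[0.0, 0.0], [0.0, 0.0]]
new_groups = proximal_step(groups, grads, lr=1.0, lam=1.0)
```

In this example the first group (norm 5) is shrunk toward zero by a factor of 0.8, while the second group, whose norm is below the threshold, is zeroed out entirely. For a network, a group would correspond to a channel or a row of a weight matrix, as in the group definitions listed above.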
