DyNet-DAO

This repo implements dynamic activation offloading (DAO) on top of DyNet.

Build

Preparation

# Load CUDA 11.1 (site-specific helper script)
cd ~
source cuda.sh 11.1

Install Eigen

mkdir eigen
cd eigen
wget https://github.com/clab/dynet/releases/download/2.1/eigen-b2e267dc99d4.zip
unzip eigen-b2e267dc99d4.zip

Build

git clone https://github.com/gulang2019/dynet-dao.git
cd dynet-dao
mkdir build
cd build
# Run CMake
# -DENABLE_BOOST=ON in combination with -DENABLE_CPP_EXAMPLES=ON also
# compiles the multiprocessing c++ examples
cmake .. -DEIGEN3_INCLUDE_DIR=/path/to/eigen -DENABLE_BOOST=ON -DENABLE_CPP_EXAMPLES=ON -DBACKEND=cuda -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DUSE_DAO=ON
# Compile using 2 processes
make -j 2
# Test with an example
./examples/xor

Run mnist

# Prepare the datasets (the source path below is site-specific; point the symlink at your own copy)
ln -s /ssd1/siyuanch/workspace/dynet-dao/datasets datasets
cd build/examples 
./mnist -t ../../datasets/mnist/train-images.idx3-ubyte -d ../../datasets/mnist/t10k-images.idx3-ubyte -tl ../../datasets/mnist/train-labels.idx1-ubyte -dl ../../datasets/mnist/t10k-labels.idx1-ubyte --batch_size 128 -N 20

Transformer Example

# cd <repo dir>
./build/examples/transformer-train -c models/iwslt-envi/config.txt --parameters models/iwslt-envi/en-vi.transformer.h2_l2_u128_do010101010001_att1_ls00_pe1_ml150_ffrelu &>models/iwslt-envi/log.en-vi.transformer.h2_l2_u128_do010101010001_att1_ls00_pe1_ml150_ffrelu

Example: fine-tune GPT-2 with LoRA and a skip rate of 0.2

skip_r=0.2
# cd <repo dir>
mkdir -p models/gpt2-124M
cp /home/siyuanch/ssd/workspace_zelongg/dynet-dao/models/gpt2-124M/hparams.ini models/gpt2-124M
# TODO: modify hparams.ini for epochs, batch size, and log frequency
mkdir -p models/gpt2-124M-$skip_r  # prepare initial checkpoint
echo "768 12 12 4 0 0.1 $skip_r 0 0 0.1 1 1024 1 1 0 models/gpt2-124M-$skip_r/model.params" > models/gpt2-124M-$skip_r/model.config
cp /ssd1/siyuanch/workspace_zelongg/DAO/models/124M/dynet-model.params models/gpt2-124M-$skip_r/model.params
# Add --train-percent 10 to the command below for a faster run
./build/examples/transformer-lm -c models/gpt2-124M/hparams.ini --model-path models/gpt2-124M-$skip_r --attn-lora-r 2 --attention-dropout-p $skip_r --ff-dropout-p 0 --reset-if-stuck --use-smaller-minibatch 2>&1 | tee models/gpt2-124M-$skip_r/train.log

# Run the transformer LM with DAO offloading enabled
./build/examples/transformer-lm --train-percent 3 --use_offload --dao-gpu-mem 16384  --dao-verbose 0 -c models/gpt2-124M/hparams.ini --attn-lora-r 2 --attention-dropout-p 0.2 --ff-dropout-p 0.2 --reset-if-stuck --use-smaller-minibatch --dynet-seed 1 2>&1 | tee models/gpt2-124M-0.2/train.log

DAO command line arguments

  • --use_offload: enable DAO's offloading backend; otherwise, DyNet's default backend is used.
  • --dao-gpu-mem [int]: GPU memory budget for DAO's backend, in MB.
  • --dao-cpu-mem [int]: CPU memory budget for DAO's backend, in MB.
  • --dao-verbose [int=0]: verbosity level of DAO (default 0).
  • --dao-debug: enable DAO's internal assertions (useful for debugging).
  • --dynet-seed [int=0]: the random seed (default 0, meaning a random seed is used).
  • --dao-profile 1: enable kernel tracing. In your C++ application code, #include <DAO/DAO.h> and call the DAO::profiler.dump(std::string name) method to write the traces to a "name.traces" file (see the sketch below).
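
For example, a minimal sketch of dumping the collected traces at the end of a program; the training loop is omitted, and DAO::profiler.dump is the call described above:

#include <DAO/DAO.h>

int main(int argc, char** argv) {
  // ... initialize DyNet/DAO and run training; launch the binary with --dao-profile 1 ...

  // Write the collected kernel traces to "mnist.traces".
  DAO::profiler.dump("mnist");
  return 0;
}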

GPT2 scripts

We provide a script that generates run scripts for GPT-2; use python examples/gpt2/gen_script.py --help to see the usage.

python examples/gpt2/gen_script.py --name gpt2-124M -c models/gpt2-124M/hparams.ini --gpu-mem 3 --attn-lora-r 4 --attention-dropout-p 0.0 0.4 0.8 --ff-dropout-p 0.0 0.4 0.8 --update-freq 8 --bs 2048 --script-name run_linear

DAO API

We use Engine to train the model by delaying the forward/backward/update calls. The API can be found in the header, and an example usage can be found in the repository.
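
A hypothetical sketch of what a delayed training step could look like; the Engine method names used here (symbolic_forward, symbolic_backward, symbolic_update, run) are illustrative assumptions rather than the actual API, so consult the Engine header in this repo for the real interface:

// Hypothetical sketch: the engine method names below are assumptions, not the real DAO API.
#include <dynet/dynet.h>
#include <dynet/expr.h>
#include <dynet/training.h>
#include <DAO/DAO.h>

void train_one_batch(dynet::Parameter& p_W,
                     dynet::AdamTrainer& trainer,
                     DAO::Engine& engine) {                       // assumed type name
  dynet::ComputationGraph cg;
  dynet::Expression W = dynet::parameter(cg, p_W);
  dynet::Expression loss = dynet::sum_elems(dynet::square(W));    // toy loss

  // Instead of running cg.forward()/cg.backward() and trainer.update() eagerly,
  // the steps are registered with the engine and executed later, which is what
  // lets DAO schedule activation offloading across the whole step.
  engine.symbolic_forward(cg, loss);   // assumed API
  engine.symbolic_backward(cg, loss);  // assumed API
  engine.symbolic_update(trainer);     // assumed API
  engine.run();                        // assumed API: execute the delayed operations
}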

We also add a feature to DyNet for specifying whether a parameter is trainable. dynet::ParameterCollection::set_default_updated(bool trainable) sets whether parameters in the collection are trainable by default. To control trainability per parameter, one can use the following API when adding parameters to the collection:

/**
 * \brief Add parameters with custom initializer
 *
 * \param d Shape of the parameter
 * \param init Custom initializer
 * \param name Name of the parameter
 * \param device Device placement for the parameter
 * \param trainable Whether the parameter is trainable or not
 *
 * \return Parameter object to be used in the computation graph
 */
Parameter ParameterCollection::add_parameters(const Dim& d, const ParameterInit& init,
                                              const std::string& name, Device* device,
                                              bool trainable);
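
For instance, a short sketch of combining the two calls so that only LoRA adapters receive gradient updates; the parameter names and dimensions are illustrative:

#include <dynet/dynet.h>
#include <dynet/model.h>
#include <dynet/param-init.h>
#include <dynet/globals.h>

using namespace dynet;

void build_params(ParameterCollection& model) {
  // Freeze everything added to this collection by default.
  model.set_default_updated(false);

  // Frozen base weight (inherits the default: not trainable).
  Parameter W = model.add_parameters({768, 768}, ParameterInitGlorot(), "W");

  // Trainable LoRA factors: override the default via the trailing argument.
  // dynet::default_device is DyNet's global default device.
  Parameter lora_a = model.add_parameters({768, 2}, ParameterInitGlorot(), "lora_a",
                                          dynet::default_device, /*trainable=*/true);
  Parameter lora_b = model.add_parameters({2, 768}, ParameterInitGlorot(), "lora_b",
                                          dynet::default_device, /*trainable=*/true);
}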
