ALBERT

Cloned from Google ALBERT, which is only suitable for CPU, single-GPU, and TPU training. We followed the approach of bert-multi-gpu and made slight modifications to support multi-GPU fine-tuning on AWS p3.16xlarge.

Modification Details

Reset the Estimator and EstimatorSpec, because the original ones are only suitable for a single training device.
Adapt MirroredStrategy into the training process. Note: the input data is batched by the global batch size, whereas the batch size set in the FLAGS parameters is the local batch size.
Adapt the optimizers, including AdamW and LAMB, in custom_optimization.py.
The NVIDIA Collective Communications Library (NCCL) is required for the reduce operations.
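
The relationship between the local and global batch size noted above can be sketched as follows. This is only an illustration, assuming synchronous data parallelism in which every GPU replica processes one local batch per step; the helper name global_batch_size is hypothetical and is not part of this repository.

```python
def global_batch_size(local_batch_size: int, num_replicas: int) -> int:
    """Return the number of examples consumed per training step.

    Assumes synchronous data parallelism: each of the `num_replicas`
    GPUs processes `local_batch_size` examples in every step.
    """
    if local_batch_size <= 0 or num_replicas <= 0:
        raise ValueError("batch size and replica count must be positive")
    return local_batch_size * num_replicas


# Values taken from the example script: BS=8 and --num_gpu_cores=2.
print(global_batch_size(8, 2))  # prints 16
```

Under this assumption, changing the number of GPUs changes the effective batch size unless the local batch size is adjusted to compensate.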

Data and Evaluation scripts

SQuAD

GLUE

Simply run python3 download_glue_data.py to download all GLUE tasks.

Simple Fine-tuning

Download and save the pre-trained models by running the bash file download_pretrained_models.sh.
Use multi_run_albert_glue.sh to fine-tune ALBERT on GLUE and multi_run_albert_squad.sh to fine-tune on SQuAD.

Don't forget to set the file paths inside the bash file.

export TASK=CoLA
export ALBERT_DIR=base
export VERSION=2
export CURRENT_PWD=/home/ubuntu

export GLUE_DIR=${CURRENT_PWD}/glue_data
export OUTPUT_DIR=${CURRENT_PWD}/albert_output/${TASK}_${ALBERT_DIR}_v${VERSION}

export BS=8
export MSL=128
export LR=5e-06
export WPSP=320
export TSP=5336

pip3 install numpy
pip3 install -r requirements.txt

sudo CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
    python3 -m albert.run_multigpus_classifier \
    --do_train=True \
    --do_eval=True \
    --strategy_type=mirror \
    --num_gpu_cores=2 \
    --data_dir=${GLUE_DIR} \
    --cached_dir=${CURRENT_PWD}/cached_albert_tfrecord \
    --task_name=${TASK} \
    --output_dir=${OUTPUT_DIR} \
    --max_seq_length=${MSL} \
    --train_step=${TSP} \
    --warmup_step=${WPSP} \
    --train_batch_size=${BS} \
    --learning_rate=${LR} \
    --albert_config_file=${CURRENT_PWD}/pretrained_model/albert_${ALBERT_DIR}_v${VERSION}/albert_config.json \
    --init_checkpoint=${CURRENT_PWD}/pretrained_model/albert_${ALBERT_DIR}_v${VERSION}/model.ckpt-best \
    --vocab_file=./30k-clean.vocab \
    --spm_model_file=./30k-clean.model
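
To illustrate how the --train_step, --warmup_step and --learning_rate values interact, here is a minimal sketch. It assumes a BERT-style schedule (linear warmup, then linear decay to zero over the full run); the actual schedule in custom_optimization.py may differ, and the function name learning_rate_at is hypothetical.

```python
def learning_rate_at(step: int, init_lr: float,
                     train_steps: int, warmup_steps: int) -> float:
    """Learning rate at a given step under an assumed BERT-style schedule."""
    if step < warmup_steps:
        # Linear warmup from 0 towards init_lr.
        return init_lr * step / warmup_steps
    # Linear decay from init_lr (at step 0) to 0 (at train_steps),
    # clamped so the rate never becomes negative.
    return init_lr * max(0.0, 1.0 - step / train_steps)


# Values taken from the example script: LR=5e-06, TSP=5336, WPSP=320.
for step in (0, 160, 320, 5336):
    print(step, learning_rate_at(step, 5e-06, 5336, 320))
```

With these settings the rate climbs for the first 320 steps and then decays until step 5336, so changing TSP without scaling WPSP changes the share of training spent in warmup.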
