
# OpenELM Parameter-Efficient Finetuning (PEFT)

We fine-tune models using the evaluation setup described in LLM Adapters. This involves jointly fine-tuning on 8 commonsense reasoning datasets with a combined training set of 170k examples. We follow the evaluation setup of the official LLM Adapters code release, except that we use log-likelihood rather than regex parsing to determine the model's output.
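The log-likelihood selection step can be sketched as follows. This is a stand-in for illustration only: in practice the per-choice log-likelihoods come from the fine-tuned model via the evaluation harness, and the scores below are made up.

```python
# Sketch of log-likelihood answer selection. Assumes the total
# log-likelihood of each candidate completion has already been
# computed by the model (the values below are illustrative).

def pick_answer(choice_logliks: dict) -> str:
    """Return the answer choice whose completion has the highest
    total log-likelihood under the model."""
    return max(choice_logliks, key=choice_logliks.get)

# Example: a boolq-style question with two candidate completions.
scores = {"true": -2.1, "false": -3.7}
print(pick_answer(scores))  # -> true
```

Unlike regex parsing of generated text, this scoring never fails to extract an answer, since one of the fixed candidate completions is always selected.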

## Setup

To ensure consistency of evaluations with LLM Adapters, we use helper functions defined in their code. To set up for evaluations, run the following commands:

```bash
# Change this to the path to CoreNet.
cd /path/to/corenet

# Install LM Evaluation Harness.
git clone https://github.com/EleutherAI/lm-evaluation-harness.git
cd lm-evaluation-harness
git checkout 3196e907fa195b684470a913c7235ed7f08a4383
pip install -e .
cd ..

# Install LLM Adapters.
git clone https://github.com/AGI-Edgerunners/LLM-Adapters.git
cd LLM-Adapters
git checkout 816657208af4db747803f87ba40a4c71383fed7a
touch __init__.py
pip install -r requirements.txt -c ../internal/constraints.txt
cd ..

# Install Hugging Face libraries and their dependencies.
python3 -m pip install --upgrade transformers==4.36.2
python3 -m pip install --upgrade datasets==2.19.0
python3 -m pip install --upgrade accelerate==0.29.3
python3 -m pip install --upgrade sentencepiece==0.2.0
```

In our experiments, we used the LLaMA v1/v2 tokenizer. Please download the tokenizer from the official repository.

## Training

To fine-tune a 270M-parameter model with LoRA, use the following command:

```bash
CFG_FILE="projects/openelm/peft_configs/openelm_lora_270M.yaml"
WTS_FILE="https://docs-assets.developer.apple.com/ml-research/models/corenet/v0.1.0/openelm/pretrained/270M/checkpoint_average.pt"
TOKENIZER_FILE="<PATH_TO_TOKENIZER_FILE>"
# NOTE: The dataset can currently be obtained from https://github.com/AGI-Edgerunners/LLM-Adapters/blob/main/ft-training_set/commonsense_170k.json.
DATASET_FILE="<PATH_TO_COMMONSENSE_170K>"
corenet-train --common.config-file $CFG_FILE \
    --model.language-modeling.pretrained $WTS_FILE \
    --text-tokenizer.sentence-piece.model-path $TOKENIZER_FILE \
    --dataset.language-modeling.commonsense-170k.path $DATASET_FILE
```

To train with DoRA instead, edit the config file to set `use_dora` to `True`.
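For reference, the change looks roughly like the following YAML fragment. The exact placement of the key depends on the existing structure of `openelm_lora_270M.yaml`; only the `use_dora` flag name is taken from this README.

```yaml
# Hypothetical placement: set this wherever the LoRA options live in
# projects/openelm/peft_configs/openelm_lora_270M.yaml.
use_dora: true
```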

## Evaluation

To evaluate a pre-trained LoRA 270M model, use the following command:

```bash
CFG_FILE="projects/openelm/peft_configs/openelm_lora_270M_eval.yaml"
WTS_FILE="https://docs-assets.developer.apple.com/ml-research/models/corenet/v0.1.0/openelm/peft/openelm_lora_270M.pt"
TOKENIZER_FILE="<PATH_TO_TOKENIZER_FILE>"
corenet-eval-llmadapters --common.config-file $CFG_FILE \
    --model.language-modeling.pretrained $WTS_FILE \
    --text-tokenizer.sentence-piece.model-path $TOKENIZER_FILE
```

The expected results are:

| boolq | piqa  | siqa  | hellaswag | winogrande | arc-easy | arc-challenge | obqa  |
|-------|-------|-------|-----------|------------|----------|---------------|-------|
| 62.14 | 50.05 | 42.02 | 24.84     | 49.88      | 26.60    | 24.57         | 28.00 |

To evaluate other pretrained models, edit the config file to use different backbones. To evaluate DoRA models, edit the config file to set `use_dora` to `True`.

## Pretraining checkpoints

| Model        | LoRA/DoRA | Weights |
|--------------|-----------|---------|
| OpenELM-270M | LoRA      | Link    |
| OpenELM-450M | LoRA      | Link    |
| OpenELM-1.1B | LoRA      | Link    |
| OpenELM-3B   | LoRA      | Link    |
| OpenELM-270M | DoRA      | Link    |
| OpenELM-450M | DoRA      | Link    |
| OpenELM-1.1B | DoRA      | Link    |
| OpenELM-3B   | DoRA      | Link    |