We fine-tune models using the evaluation setup described in LLM Adapters. This involves jointly fine-tuning on eight commonsense reasoning datasets with a combined training set of 170k examples. We follow the evaluation setup of the official LLM Adapters code release, except that we use log-likelihood rather than regex parsing to determine the model's output.
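Concretely, log-likelihood evaluation scores each candidate answer by the probability the model assigns to its tokens, rather than generating text and regex-parsing it. The sketch below is only illustrative: it assumes a Hugging Face causal LM, and the function name `choice_log_likelihood` is ours. The actual scoring is handled by the LM Harness helpers invoked by `corenet-eval-llmadapters`.

```python
# Minimal sketch of log-likelihood scoring for multiple-choice evaluation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def choice_log_likelihood(model, tokenizer, prompt, choice):
    """Sum the log-probabilities of the choice tokens given the prompt.

    Simplification: we align by prompt token count, ignoring edge cases
    where tokenization of prompt+choice shifts the boundary.
    """
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + choice, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits  # [1, seq_len, vocab]
    # Position i predicts token i+1, so shift logits/targets by one.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = full_ids[0, 1:]
    start = prompt_ids.shape[1] - 1  # first position predicting a choice token
    return log_probs[start:].gather(1, targets[start:, None]).sum().item()

# Usage: pick the answer the model finds most likely, instead of parsing
# generated text:
# best = max(choices, key=lambda c: choice_log_likelihood(model, tok, prompt, c))
```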
To ensure consistency with the LLM Adapters evaluations, we use helper functions defined in their code. To set up for evaluations, run the following commands:
```bash
# Change this to the path to CoreNet.
cd /path/to/corenet

# Install LM Harness.
git clone https://github.com/EleutherAI/lm-evaluation-harness.git
cd lm-evaluation-harness
git checkout 3196e907fa195b684470a913c7235ed7f08a4383
pip install -e .
cd ..

# Install LLM Adapters.
git clone https://github.com/AGI-Edgerunners/LLM-Adapters.git
cd LLM-Adapters
git checkout 816657208af4db747803f87ba40a4c71383fed7a
touch __init__.py
pip install -r requirements.txt -c ../internal/constraints.txt
cd ..

# Install Hugging Face libraries and their dependencies.
python3 -m pip install --upgrade transformers==4.36.2
python3 -m pip install --upgrade datasets==2.19.0
python3 -m pip install --upgrade accelerate==0.29.3
python3 -m pip install --upgrade sentencepiece==0.2.0
```
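As a quick sanity check (our suggestion, not part of the official setup), you can confirm that the pinned versions resolved correctly:

```python
# Confirm the pinned dependency versions installed above.
import accelerate
import datasets
import sentencepiece
import transformers

print(transformers.__version__)   # expected: 4.36.2
print(datasets.__version__)       # expected: 2.19.0
print(accelerate.__version__)     # expected: 0.29.3
print(sentencepiece.__version__)  # expected: 0.2.0
```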
In our experiments, we used the LLaMA v1/v2 tokenizer. Please download the tokenizer from the official repository.
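Before training, it can help to verify that the downloaded tokenizer loads. The sketch below is optional; the path is a placeholder for your tokenizer file.

```python
# Verify the SentencePiece tokenizer file loads correctly.
import sentencepiece as spm

sp = spm.SentencePieceProcessor(model_file="<PATH_TO_TOKENIZER_FILE>")
print(sp.vocab_size())  # the LLaMA v1/v2 tokenizer has a 32000-token vocabulary
print(sp.encode("The quick brown fox", out_type=str))  # sample tokenization
```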
To fine-tune a 270M-parameter model with LoRA, use the following command:
CFG_FILE="projects/openelm/peft_configs/openelm_lora_270M.yaml"
WTS_FILE="https://docs-assets.developer.apple.com/ml-research/models/corenet/v0.1.0/openelm/pretrained/270M/checkpoint_average.pt"
TOKENIZER_FILE="<PATH_TO_TOKENIZER_FILE>"
# NOTE: The dataset can currently be obtained from https://github.com/AGI-Edgerunners/LLM-Adapters/blob/main/ft-training_set/commonsense_170k.json.
DATASET_FILE="<PATH_TO_COMMONSENSE_170K>"
corenet-train --common.config-file $CFG_FILE \
--model.language-modeling.pretrained $WTS_FILE \
--text-tokenizer.sentence-piece.model-path $TOKENIZER_FILE \
--dataset.language-modeling.commonsense-170k.path $DATASET_FILE
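Optionally, you can inspect the downloaded dataset before launching training. This snippet is our suggestion; the filename stands in for `<PATH_TO_COMMONSENSE_170K>`.

```python
# Inspect the joint commonsense fine-tuning data before training.
import json

with open("commonsense_170k.json") as f:
    data = json.load(f)

print(len(data))  # roughly 170k joint training examples
print(data[0])    # one record, to check the schema
```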
To train with DoRA instead, edit the config file to set `use_dora` to `True`.
To evaluate a pre-trained LoRA 270M model, use the following command:
CFG_FILE="projects/openelm/peft_configs/openelm_lora_270M_eval.yaml"
WTS_FILE="https://docs-assets.developer.apple.com/ml-research/models/corenet/v0.1.0/openelm/peft/openelm_lora_270M.pt"
TOKENIZER_FILE="<PATH_TO_TOKENIZER_FILE>"
corenet-eval-llmadapters --common.config-file $CFG_FILE \
--model.language-modeling.pretrained $WTS_FILE \
--text-tokenizer.sentence-piece.model-path $TOKENIZER_FILE
The expected results (accuracy, %) are:

| boolq | piqa | siqa | hellaswag | winogrande | arc-easy | arc-challenge | obqa |
|---|---|---|---|---|---|---|---|
| 62.14 | 50.05 | 42.02 | 24.84 | 49.88 | 26.60 | 24.57 | 28.00 |
To evaluate other pretrained models, edit the config file to use a different backbone. To evaluate DoRA models, edit the config file to set `use_dora` to `True`. The fine-tuned adapter weights are available below:
| Model | LoRA/DoRA | Weights |
|---|---|---|
| OpenELM-270M | LoRA | Link |
| OpenELM-450M | LoRA | Link |
| OpenELM-1.1B | LoRA | Link |
| OpenELM-3B | LoRA | Link |
| OpenELM-270M | DoRA | Link |
| OpenELM-450M | DoRA | Link |
| OpenELM-1.1B | DoRA | Link |
| OpenELM-3B | DoRA | Link |