
SmolLM evaluation scripts

We're using the LightEval library to benchmark our models.

Check out the quick tour to configure it for your own hardware and tasks.

Setup

Use a conda or venv environment with Python >= 3.10.

Install PyTorch, adjusting the index URL to match your CUDA version and environment:

pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cu121

For reproducibility, we recommend installing the pinned library versions:

pip install -r requirements.txt

Running the evaluations

SmolLM2 base models

lighteval accelerate \
  --model_args "pretrained=HuggingFaceTB/SmolLM2-1.7B,revision=main,dtype=bfloat16,vllm,gpu_memory_utilisation=0.8,max_model_length=2048" \
  --custom_tasks "tasks.py" --tasks "smollm2_base.txt" --output_dir "./evals" --save_details
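When evaluating several checkpoints with identical settings, it helps to build the command programmatically rather than editing it by hand. The sketch below (a convenience wrapper we assume here, not part of LightEval) assembles the same argument list as the command above so it can be passed to `subprocess.run`:

```python
# Sketch: build the lighteval command for a given checkpoint so the same
# settings can be swept over several models. The flags mirror the command above.
def lighteval_cmd(model, tasks_file, output_dir="./evals",
                  max_len=2048, chat_template=False):
    model_args = (
        f"pretrained={model},revision=main,dtype=bfloat16,vllm,"
        f"gpu_memory_utilisation=0.8,max_model_length={max_len}"
    )
    cmd = [
        "lighteval", "accelerate",
        "--model_args", model_args,
        "--custom_tasks", "tasks.py",
        "--tasks", tasks_file,
        "--output_dir", output_dir,
        "--save_details",
    ]
    if chat_template:
        # Instruction-tuned models need the chat template (see below).
        cmd.insert(-1, "--use_chat_template")
    return cmd
```

For example, `subprocess.run(lighteval_cmd("HuggingFaceTB/SmolLM2-360M", "smollm2_base.txt"))` launches the same run for the 360M checkpoint.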

SmolLM2 instruction-tuned models

(note the --use_chat_template flag)

lighteval accelerate \
  --model_args "pretrained=HuggingFaceTB/SmolLM2-1.7B-Instruct,revision=main,dtype=bfloat16,vllm,gpu_memory_utilisation=0.8,max_model_length=2048" \
  --custom_tasks "tasks.py" --tasks "smollm2_instruct.txt" --use_chat_template --output_dir "./evals" --save_details
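The `--use_chat_template` flag makes LightEval wrap each prompt in the model's chat format instead of feeding raw text, which is essential for instruction-tuned checkpoints. As a rough illustration only: SmolLM2-Instruct uses a ChatML-style layout along the lines of the sketch below; the authoritative template lives in the tokenizer config and is applied via `tokenizer.apply_chat_template`:

```python
# Illustration, not the exact template: a ChatML-style chat layout,
# approximating what --use_chat_template produces for instruct models.
def chatml_prompt(messages):
    """Format a list of {role, content} dicts in a ChatML-style layout."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    # End with an open assistant turn so the model generates the reply.
    return "".join(parts) + "<|im_start|>assistant\n"
```

Without the flag, a base-model prompt format is used and instruct models typically score far below their real ability.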

MATH and other extra tasks

Tasks are specified as suite|task|num_fewshot|truncate: custom|math|4|1 runs the custom math task with 4 few-shot examples, and the trailing 1 lets LightEval reduce the few-shot count automatically if the prompt would exceed the context length.

lighteval accelerate \
  --model_args "pretrained=HuggingFaceTB/SmolLM2-1.7B-Instruct,revision=main,dtype=bfloat16,vllm,gpu_memory_utilisation=0.8,max_model_length=4096" \
  --custom_tasks "tasks.py" --tasks "custom|math|4|1" --use_chat_template --output_dir "./evals" --save_details
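With `--save_details`, LightEval writes per-sample details alongside a JSON results file under `--output_dir`. The exact path and schema can vary between versions; the helper below is a sketch that assumes a top-level `"results"` mapping of task names to metric dicts and collects the numeric scores:

```python
# Sketch of post-processing: collect (task, metric, value) rows from a
# lighteval results JSON. The {"results": {task: {metric: value}}} layout
# is an assumption for illustration; check your lighteval version's output.
import json
import pathlib

def summarize(results_path):
    """Return (task, metric, value) tuples for every numeric metric."""
    data = json.loads(pathlib.Path(results_path).read_text())
    rows = []
    for task, metrics in data.get("results", {}).items():
        for metric, value in metrics.items():
            if isinstance(value, (int, float)):
                rows.append((task, metric, value))
    return rows
```

Pointing `summarize` at the newest file under `./evals` gives a quick flat view of all scores for spreadsheets or regression tracking.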