OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework

arXiv: https://arxiv.org/abs/2404.14619

We provide instructions for pre-training, evaluation, instruction tuning, and parameter-efficient finetuning, along with pretrained models and checkpoints:

  1. Pre-training
  2. Evaluation
  3. Instruction Tuning
  4. Parameter-Efficient Finetuning
  5. MLX Conversion
  6. HuggingFace (a minimal loading sketch follows this list)
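
As a quick orientation for the HuggingFace route, below is a minimal sketch of loading an OpenELM checkpoint with the transformers library. The checkpoint ID apple/OpenELM-270M and the meta-llama/Llama-2-7b-hf tokenizer ID are assumptions about the released artifacts, not instructions from this repository; substitute the checkpoint and tokenizer you actually use.

# Minimal sketch: load an OpenELM checkpoint from the HuggingFace Hub and generate text.
# The model and tokenizer IDs below are assumptions; replace them with your own.
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "apple/OpenELM-270M",   # assumed checkpoint ID; OpenELM ships custom modeling code
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")  # OpenELM reuses the LLaMA tokenizer

prompt = "Once upon a time there was"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))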

Tokenizer

In our experiments, we used the LLaMA v1/v2 tokenizer. Please download the tokenizer from the official repository.
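
Once downloaded, the tokenizer.model file can be loaded directly with sentencepiece. A minimal sketch, assuming a hypothetical local path:

# Minimal sketch: load the downloaded LLaMA tokenizer with sentencepiece.
# The path below is hypothetical; point it at your local tokenizer.model file.
import sentencepiece as spm

sp = spm.SentencePieceProcessor(model_file="path/to/llama/tokenizer.model")
ids = sp.encode("Hello, OpenELM!", out_type=int)
print(ids)             # token IDs
print(sp.decode(ids))  # round-trip back to text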

Bias, Risks, and Limitations

The release of OpenELM models aims to empower and enrich the open research community by providing access to state-of-the-art language models. Trained on publicly available datasets, these models are made available without any safety guarantees. Consequently, there exists the possibility of these models producing outputs that are inaccurate, harmful, biased, or objectionable in response to user prompts. Thus, it is imperative for users and developers to undertake thorough safety testing and implement appropriate filtering mechanisms tailored to their specific requirements.

Citation

If you find our work useful, please cite:

@article{mehta2024openelm,
  title={OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework},
  author={Sachin Mehta and Mohammad Hossein Sekhavat and Qingqing Cao and Maxwell Horton and Yanzi Jin and Chenfan Sun and Iman Mirzadeh and Mahyar Najibi and Dmitry Belenko and Peter Zatloukal and Mohammad Rastegari},
  year={2024},
  eprint={2404.14619},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}

@inproceedings{mehta2022cvnets,
  author={Mehta, Sachin and Abdolhosseini, Farzad and Rastegari, Mohammad},
  title={CVNets: High Performance Library for Computer Vision},
  year={2022},
  booktitle={Proceedings of the 30th ACM International Conference on Multimedia},
  series={MM '22}
}