spaparaju/llm-foundry

Examples

Fast reference examples for training ML models with recipes. Designed to be easily forked and modified.

ResNet-50 + ImageNet

Figure 1: Comparison of MosaicML recipes against other results, all measured on 8x A100s on MosaicML Cloud.

Train the MosaicML ResNet, the fastest ResNet-50 implementation, which yields a ✨ 7x ✨ faster time-to-train than a strong baseline. See our blog for more details and recipes. Our recipes were also demonstrated at MLPerf, a cross-industry ML benchmark.

🚀 Get started with the code here.

DeepLabV3 + ADE20k

Train the MosaicML DeepLabV3, which yields a ✨ 5x ✨ faster time-to-train than a strong baseline. See our blog for more details and recipes.

🚀 Get started with the code here.

Large Language Models (LLMs)

Training curves for various LLM sizes.

A simple yet feature-complete implementation of GPT that scales to 70B parameters while maintaining high performance on GPU clusters. The code is flexible and written in vanilla PyTorch, using PyTorch FSDP and some recent efficiency improvements.

🚀 Get started with the code here.
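For a rough sense of what a 70B-parameter scale implies, here is a back-of-the-envelope sketch. It is not code from this repository. It assumes the common approximation of about 12 x d_model^2 weights per transformer block (attention plus a 4x-wide MLP, ignoring biases and layer norms), tied input/output embeddings, learned position embeddings, and roughly 16 bytes of training state per parameter for mixed-precision Adam, sharded evenly across GPUs as a fully sharded scheme such as FSDP would do. The function names and the example configuration are hypothetical.

```python
def gpt_param_count(n_layers, d_model, vocab_size, max_seq_len):
    """Approximate parameter count of a GPT-style decoder.

    Assumes ~12 * d_model^2 weights per block (4 * d_model^2 for attention,
    8 * d_model^2 for a 4x-wide MLP), tied input/output embeddings, and
    learned position embeddings. Biases and layer norms are ignored.
    """
    per_block = 12 * d_model * d_model
    token_emb = vocab_size * d_model
    pos_emb = max_seq_len * d_model
    return n_layers * per_block + token_emb + pos_emb


def sharded_state_gib_per_gpu(n_params, n_gpus, bytes_per_param=16):
    """Rough per-GPU memory (GiB) for weights, gradients, and optimizer state.

    bytes_per_param=16 is an assumption for mixed-precision Adam
    (2 weights + 2 grads + 4 master weights + 4 + 4 Adam moments).
    Activations and temporary buffers are not counted.
    """
    return n_params * bytes_per_param / n_gpus / 2**30


if __name__ == "__main__":
    # Hypothetical configuration, chosen only to land in the tens of billions.
    n = gpt_param_count(n_layers=80, d_model=8192, vocab_size=50368, max_seq_len=2048)
    print(f"approx. parameters: {n / 1e9:.1f}B")
    for gpus in (8, 64, 256):
        print(f"{gpus:>4} GPUs: ~{sharded_state_gib_per_gpu(n, gpus):.1f} GiB of sharded state each")
```

The estimate only covers sharded model and optimizer state, so real memory use will be higher once activations and communication buffers are included.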

About

LLM training code for MosaicML foundation models
