Fast reference benchmarks for training ML models with recipes. Designed to be easily forked and modified.
Figure 1: Comparison of MosaicML recipes against other results, all measured on 8x A100s on MosaicML Cloud.
Train the MosaicML ResNet, the fastest ResNet-50 implementation, yielding a ✨ 7x ✨ faster time-to-train than a strong baseline. See our blog for more details and recipes. Our recipes were also demonstrated at MLPerf, a cross-industry ML benchmark.
🚀 Get started with the code here.
A simple yet feature-complete implementation of GPT-3 that scales to 175B parameters while maintaining GPU utilization comparable to other approaches. Flexible code, written in vanilla PyTorch, that uses PyTorch FSDP and some recent efficiency improvements.
🚀 Get started with the code here.
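To give a sense of what "vanilla PyTorch" means here, the sketch below shows the core operation of a GPT-style model: a causal self-attention layer written with only standard `torch.nn` building blocks. This is an illustrative example, not the repository's actual code; the class name and dimensions are assumptions.

```python
import math
import torch
import torch.nn as nn

class CausalSelfAttention(nn.Module):
    """Minimal causal self-attention, the core op of a GPT-style block.

    Illustrative sketch only; the real implementation adds dropout,
    weight init, and FSDP-friendly module structure.
    """

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        # Single fused projection producing queries, keys, and values.
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, C = x.shape  # batch, sequence length, model dim
        q, k, v = self.qkv(x).chunk(3, dim=-1)

        def split_heads(t: torch.Tensor) -> torch.Tensor:
            return t.view(B, T, self.n_heads, C // self.n_heads).transpose(1, 2)

        q, k, v = split_heads(q), split_heads(k), split_heads(v)

        # Scaled dot-product attention with a causal (upper-triangular) mask
        # so each position can only attend to earlier positions.
        att = (q @ k.transpose(-2, -1)) / math.sqrt(q.size(-1))
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
        att = att.masked_fill(mask, float("-inf")).softmax(dim=-1)

        y = (att @ v).transpose(1, 2).reshape(B, T, C)
        return self.proj(y)
```

In the full model this layer would sit inside a residual transformer block, and FSDP would shard its parameters across GPUs to reach the 175B-parameter scale mentioned above.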