This repository holds NVIDIA-maintained utilities to streamline mixed precision and distributed training in PyTorch. Some of the code here will eventually be included in upstream PyTorch. The intention of Apex is to make up-to-date utilities available to users as quickly as possible.
Full API Documentation: https://nvidia.github.io/apex
apex.amp is a tool designed for ease of use and maximum safety in FP16 training. All potentially unsafe ops are performed in FP32 under the hood, while safe ops are performed using faster, Tensor Core-friendly FP16 math. amp also automatically implements dynamic loss scaling.
The intention of amp is to be the "on-ramp" to easy FP16 training: achieve all the numerical stability of full FP32 training, with most of the performance benefits of full FP16 training.
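As a rough illustration of what "dynamic loss scaling" usually means, here is a minimal pure-Python sketch. It assumes the common scheme (shrink the scale and skip the step when a scaled gradient overflows, grow the scale again after a run of clean steps). The class `DynamicLossScaler` and its parameters are hypothetical names for this sketch, not Apex's API, and the actual behavior of amp may differ.

```python
import math


class DynamicLossScaler:
    """Toy dynamic loss scaler (hypothetical helper, not Apex's implementation)."""

    def __init__(self, init_scale=2.0 ** 16, factor=2.0, growth_interval=3):
        self.scale = init_scale
        self.factor = factor
        self.growth_interval = growth_interval
        self._good_steps = 0

    def unscale(self, scaled_grads):
        """Return unscaled gradients, or None if the step must be skipped."""
        if any(math.isinf(g) or math.isnan(g) for g in scaled_grads):
            # Overflow: shrink the scale and skip this optimizer step.
            self.scale /= self.factor
            self._good_steps = 0
            return None
        grads = [g / self.scale for g in scaled_grads]
        self._good_steps += 1
        if self._good_steps % self.growth_interval == 0:
            # A run of clean steps: try a larger scale again.
            self.scale *= self.factor
        return grads


scaler = DynamicLossScaler(init_scale=8.0, growth_interval=2)
skipped = scaler.unscale([float("inf"), 1.0])  # overflow, so the step is skipped
first = scaler.unscale([4.0, 8.0])             # unscaled with the reduced scale
second = scaler.unscale([4.0, 8.0])            # second clean step grows the scale
```

In a real training loop the loss would be multiplied by `scaler.scale` before the backward pass, and the optimizer step would run only when `unscale` returns gradients.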
Python Source and API Documentation
apex.FP16_Optimizer wraps an existing PyTorch optimizer and automatically implements master parameters and static or dynamic loss scaling under the hood.
The intention of FP16_Optimizer is to be the "highway" for FP16 training: achieve most of the numerical stability of full FP32 training, and almost all the performance benefits of full FP16 training.
Simple examples with FP16_Optimizer
word_language_model with FP16_Optimizer
The Imagenet and word_language_model directories also contain examples that show manual management of master parameters and static loss scaling.
These manual examples illustrate what sort of operations amp and FP16_Optimizer are performing automatically.
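To make the two ideas concrete without a GPU, the sketch below mimics them on plain Python floats. It is an assumption-laden toy, not the code in the example directories: FP16 is emulated with the standard library's half-precision `struct` format, a Python float stands in for the FP32 master copy, and `to_fp16`, `LOSS_SCALE`, and the numeric values are made up for illustration.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


LOSS_SCALE = 1024.0  # static loss scale (illustrative value)
true_grad = 1e-8     # too small for FP16: rounds to zero

# 1) Static loss scaling: scaling the loss scales every gradient, so the
#    FP16 gradient survives; it is unscaled again in higher precision.
unscaled_naive = to_fp16(true_grad)  # underflows
unscaled_scaled = to_fp16(true_grad * LOSS_SCALE) / LOSS_SCALE

# 2) Master parameters: the optimizer updates a higher-precision copy and
#    the FP16 model weight is refreshed from it after each step.
update = 1e-4  # per-step weight change, below FP16 resolution near 1.0
fp16_only = to_fp16(1.0)
master = 1.0   # stands in for the FP32 master copy
for _ in range(10):
    fp16_only = to_fp16(fp16_only + update)  # update is rounded away
    master += update                         # update accumulates
model_weight = to_fp16(master)               # FP16 copy used by the model
```

Without scaling the gradient is lost entirely, and without a master copy the weight never moves; with both, the small updates eventually show up in the FP16 weight.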
apex.parallel.DistributedDataParallel is a module wrapper, similar to torch.nn.parallel.DistributedDataParallel. It enables convenient multiprocess distributed training, optimized for NVIDIA's NCCL communication library.
apex.parallel.multiproc is a launch utility that helps set up arguments for DistributedDataParallel.
The Imagenet with FP16_Optimizer mixed precision examples also demonstrate apex.parallel.DistributedDataParallel.
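Conceptually, a data-parallel wrapper keeps model replicas in sync by combining the gradients that each process computes on its own shard of the data. The pure-Python sketch below simulates that step by averaging gradients across two pretend processes. The function `all_reduce_mean` is a hypothetical stand-in for the NCCL collective, and the sketch assumes averaging, as torch.nn.parallel.DistributedDataParallel does; it is not how the Apex wrapper is implemented.

```python
def all_reduce_mean(per_process_grads):
    """Average gradients element-wise across simulated processes."""
    world_size = len(per_process_grads)
    summed = [sum(vals) for vals in zip(*per_process_grads)]
    return [s / world_size for s in summed]


# Each inner list is the gradient one process computed on its own data shard.
local_grads = [
    [0.5, -1.0],  # process 0
    [1.5, 3.0],   # process 1
]
synced = all_reduce_mean(local_grads)
# Every process would apply the same `synced` gradient, keeping replicas identical.
```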
Python 3
CUDA 9
PyTorch 0.4 or newer. We recommend using the latest stable release, obtainable from https://pytorch.org/. We also test against the latest master branch, obtainable from https://github.com/pytorch/pytorch.
If you have any problems building, please file an issue.
To build the extension, run the following command in the root directory of this project:
python setup.py install
To use the extension:
import apex
and optionally (if required for your use)
import apex_C as apex_backend
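After building, a quick way to check that both modules are visible to the current interpreter is to look them up without importing them. This is only a convenience sketch using the standard library; the helper name `is_installed` is hypothetical and not part of Apex.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


for name in ("apex", "apex_C"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If either module is reported missing, confirm that the build ran in the same Python environment you are using now.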
Windows support is experimental, and Linux is recommended. If you wish to install Apex on Windows, there are two requirements:
- Apex must be installed in the same Conda environment as PyTorch.
- Building Apex requires the same Visual Studio environment settings as building PyTorch from source:
cd apex_dir
set "VS150COMNTOOLS=C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\VC\Auxiliary\Build"
call "%VS150COMNTOOLS%\vcvarsall.bat" x64 -vcvars_ver=14.11
python setup.py install
You may need to replace 2017, Enterprise, or vcvars_ver according to your version of Visual Studio.