Force allreduce of all gradients in step(); bugfixes
FP16 hierarchical allreduce
PyTorch 1.0, TF-Keras, FP16 ops on GPU
Parallelized hierarchical allreduce
Support for the upcoming PyTorch release
Compatibility with PyTorch 0.4.1
Support for IBM PowerAI DDL & APIs to restore optimizer state
Critical Bugfix: PyTorch must wait for GPU data before allreduce
Critical Bugfix: non-fused allreduce produces incorrect results
Hierarchical allreduce & differentiable ops
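Taken together, these releases track Horovod's PyTorch integration: wrapping the optimizer so step() drives the allreduce of all gradients, FP16-compressed allreduce, and APIs to restore optimizer state across workers. A minimal sketch of how those pieces fit, assuming the Horovod PyTorch API of roughly the 0.15.x era (the toy model, data, and hyperparameters here are placeholders, not part of the releases above):

```python
import torch
import horovod.torch as hvd

hvd.init()
if torch.cuda.is_available():
    # Pin each worker to its local GPU.
    torch.cuda.set_device(hvd.local_rank())

model = torch.nn.Linear(10, 1)  # placeholder model
if torch.cuda.is_available():
    model.cuda()

# Scale the learning rate by the number of workers, per the usual Horovod recipe.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01 * hvd.size())

# FP16 compression reduces allreduce bandwidth (cf. the FP16 releases above).
optimizer = hvd.DistributedOptimizer(
    optimizer,
    named_parameters=model.named_parameters(),
    compression=hvd.Compression.fp16,
)

# Synchronize initial model parameters and optimizer state from rank 0
# (cf. the "APIs to restore optimizer state" release above).
hvd.broadcast_parameters(model.state_dict(), root_rank=0)
hvd.broadcast_optimizer_state(optimizer, root_rank=0)

# One training step; optimizer.step() forces the allreduce of all gradients.
x, y = torch.randn(4, 10), torch.randn(4, 1)
if torch.cuda.is_available():
    x, y = x.cuda(), y.cuda()
loss = torch.nn.functional.mse_loss(model(x), y)
loss.backward()
optimizer.step()
```

The hierarchical allreduce variants listed above are not selected through this API; they are toggled at run time via the HOROVOD_HIERARCHICAL_ALLREDUCE environment variable when launching the job.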