Tags: djatkdgus789/torchgpipe
v0.0.5

Featured: @skippable for efficient skip connections. With this interface, GPipe copies skip tensors directly to the destination device (see the sketch after this entry).

Improvements:
- Checkpointing deterministically handles randomness managed by PyTorch.
- balance_by_size() analyzes parameters as well.

Breaking Changes:
- Moved torchgpipe_balancing module to torchgpipe.balance.
- Redesigned interfaces of balance_by_time() and balance_by_size().
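The stash/pop pattern below follows the usage shown in torchgpipe's documentation for @skippable; the module names, the skip name 'identity', and the two-GPU balance are illustrative assumptions, not part of this release note.

```python
import torch
from torch import nn
from torchgpipe import GPipe
from torchgpipe.skip import pop, skippable, stash

# A @skippable module declares, by name, which skip tensors it stashes or
# pops; GPipe then ships them straight to the partition that pops them.
@skippable(stash=['identity'])
class Stash(nn.Module):
    def forward(self, x):
        yield stash('identity', x)  # hand x off to a later partition
        return x

@skippable(pop=['identity'])
class PopAdd(nn.Module):
    def forward(self, x):
        identity = yield pop('identity')  # receive the stashed tensor
        return x + identity               # the residual connection

model = nn.Sequential(Stash(), nn.Linear(10, 10), nn.ReLU(), PopAdd())
model = GPipe(model, balance=[2, 2], chunks=4)  # assumes two CUDA devices

output = model(torch.rand(8, 10).to(model.devices[0]))
```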
v0.0.3

Released on September 30, 2019.

Featured: torchgpipe now overlaps copy and computation using separate CUDA streams (see the sketch after this entry). Previously, a GPU could not compute a partition while micro-batches were being copied between GPUs, because both operations ran on the same default CUDA stream.

Other Improvements:
- Added support for PyTorch 1.2.
- Redesigned the internal pipeline parallelism to represent dependencies transparently.
- Fixed a hang when an exception is raised in a partition.
- Fixed unintended size accumulation in balance_by_size() (issue kakaobrain#3 by Shiyan Deng).

Breaking Changes:
- Dropped support for PyTorch 1.0.
- Changed the type of GPipe.devices from tuple to list.
- Removed current_microbatch(), which turned out to be incompatible with checkpointing.
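This is not torchgpipe's internal pipeline code, only a minimal sketch of the underlying technique: enqueue the device-to-device copy on a dedicated stream so the default (compute) stream stays busy, then synchronize before consuming the result. All names are illustrative, and two CUDA devices are assumed.

```python
import torch

copy_stream = torch.cuda.Stream(device='cuda:1')  # side stream for copies
x = torch.randn(64, 1024, device='cuda:0')

# Enqueue the copy on the side stream; cuda:1's default stream can keep
# computing on other micro-batches in the meantime.
with torch.cuda.stream(copy_stream):
    y = x.to('cuda:1', non_blocking=True)

# Before using y, make the default stream wait for the copy, and keep y's
# memory alive for the consuming stream.
torch.cuda.current_stream('cuda:1').wait_stream(copy_stream)
y.record_stream(torch.cuda.current_stream('cuda:1'))

z = y.relu()  # safe: runs after the copy has finished
```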
v0.0.2

Released on June 26, 2019.

- Added support for PyTorch 1.1.
- Refined public APIs.
- Detailed documentation.
- Proper exceptions for invalid usage.
- Provided automatic balancing.
- Provided inspecting utilities: current_microbatch() and is_recomputing() (see the sketch after this list).
- Reimplemented deferred batch normalization by subclassing.
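As an illustration of the inspecting utilities, the sketch below uses is_recomputing() to skip a side effect while checkpointing replays a forward pass, in the style of the example in torchgpipe's documentation; the ForwardCounter module itself is hypothetical. current_microbatch() is omitted since v0.0.3 removed it.

```python
from torch import nn
from torchgpipe import is_recomputing

class ForwardCounter(nn.Module):
    """Counts genuine forward passes, ignoring checkpoint recomputation."""

    def __init__(self):
        super().__init__()
        self.counter = 0

    def forward(self, input):
        # Checkpointing replays forward() during backward; skip the side
        # effect then so each micro-batch is counted exactly once.
        if not is_recomputing():
            self.counter += 1
        return input
```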