Commit
Load weights to cpu with PretrainedModelInitializer (allenai#4712)
* load weights to cpu with PretrainedModelInitializer

* changelog
eladsegal authored Oct 7, 2020
1 parent 327188b commit 40bb47a
Showing 2 changed files with 2 additions and 1 deletion.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -40,6 +40,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
and they both now return the results in bytes as integers. Also, the `peak_gpu_memory` function now utilizes PyTorch functions to find the memory
usage instead of shelling out to the `nvidia-smi` command. This is more efficient and also more accurate because it only takes
into account the tensor allocations of the current PyTorch process.
+- Make sure weights are first loaded to the cpu when using PretrainedModelInitializer, preventing wasted GPU memory.

### Removed

2 changes: 1 addition & 1 deletion allennlp/nn/initializers.py
@@ -384,7 +384,7 @@ class PretrainedModelInitializer(Initializer):
     def __init__(
         self, weights_file_path: str, parameter_name_overrides: Dict[str, str] = None
     ) -> None:
-        self.weights: Dict[str, torch.Tensor] = torch.load(weights_file_path)
+        self.weights: Dict[str, torch.Tensor] = torch.load(weights_file_path, map_location="cpu")
         self.parameter_name_overrides = parameter_name_overrides or {}

     @overrides
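
The effect of the `map_location="cpu"` argument in the change above can be illustrated with a small, self-contained sketch. It is not code from this commit: the helper name `load_weights_on_cpu`, the `torch.nn.Linear` model, and the in-memory buffer standing in for a weights file are all assumptions made for illustration. Without `map_location`, `torch.load` restores each tensor to the device it was saved from, so a checkpoint saved from a GPU would be placed back on that GPU.

```python
import io

import torch


def load_weights_on_cpu(source):
    """Hypothetical helper: load a saved state dict with every tensor mapped to the CPU."""
    return torch.load(source, map_location="cpu")


# Stand-in for a pretrained weights file (assumption: a tiny linear layer
# serialized into an in-memory buffer instead of a file on disk).
model = torch.nn.Linear(4, 2)
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)
buffer.seek(0)

weights = load_weights_on_cpu(buffer)

# Record the device type of each loaded tensor.
devices = {name: tensor.device.type for name, tensor in weights.items()}
print(devices)
```

Because the loaded tensors stay on the CPU, copying them into a model's parameters afterwards is the only step that touches the model's device.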
