Commit

add resume training
Keiku committed Jul 17, 2021
1 parent dc690ce commit 5df4cff
Showing 1 changed file with 13 additions and 0 deletions.
13 changes: 13 additions & 0 deletions README.md
@@ -117,6 +117,19 @@ export TORCH_HOME=/home/docker
python train.py +experiments=train_exp01 hydra.run.dir=outputs/train_exp01
```

### Resume Training

To resume training, set the following in the config.

```
train:
  resume: True
  checkpoint: "/mnt/nfs/kuroyanagi/clones/PyTorch-Lightning-CIFAR10/outputs/train_resume_exp01/logs/resnet18/exp01/checkpoints/last.ckpt"
```

Even if training is interrupted, for example when an AWS spot instance is reclaimed, you can load `last.ckpt` and resume from the next epoch. You can use `run.sh` as the command for restarting.
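
As a rough illustration of how the `train.resume` and `train.checkpoint` settings could be consumed, here is a minimal sketch. The helper name `resolve_resume_checkpoint` is hypothetical and is not taken from this repository's `train.py`. Falling back to a fresh run when the checkpoint file does not exist yet is also an assumption, chosen so that the same restart command works on the very first launch.

```python
import os
from typing import Any, Mapping, Optional


def resolve_resume_checkpoint(cfg: Mapping[str, Any]) -> Optional[str]:
    """Return the checkpoint path to resume from, or None to start fresh.

    Hypothetical helper: `cfg` is assumed to be a dict-like config with a
    `train` section holding `resume` and `checkpoint`, as in the YAML above.
    """
    train_cfg = cfg.get("train", {})
    if not train_cfg.get("resume", False):
        # Resuming is disabled, so train from scratch.
        return None
    checkpoint = train_cfg.get("checkpoint")
    if not checkpoint or not os.path.isfile(checkpoint):
        # Assumption: no last.ckpt yet (e.g. first launch), so start fresh.
        return None
    return checkpoint


if __name__ == "__main__":
    cfg = {"train": {"resume": True, "checkpoint": "outputs/last.ckpt"}}
    ckpt_path = resolve_resume_checkpoint(cfg)
    print("resume from:", ckpt_path)
    # With PyTorch Lightning, the path would then be handed to the trainer:
    # recent versions accept trainer.fit(model, ckpt_path=ckpt_path), while
    # older versions used Trainer(resume_from_checkpoint=ckpt_path).
```

Returning `None` instead of raising keeps an automated restart script simple, at the cost of silently starting over if the path is mistyped; a stricter variant could raise an error when `resume` is true but the file is missing.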

### Test

Specify `evaluate: True` in config as shown below.