Distributed PyTorch Template

A concise and full-featured PyTorch project template for distributed training and evaluation. Its structure is similar to that of popular frameworks such as detectron2 and mmdetection, but it simplifies their complex designs while keeping the core components. It also keeps extra package dependencies to a minimum. We hope this simple template helps you get started on your project easily and build a scalable, high-performance deep learning project.

Installation

Requirements

Linux or macOS with Python ≥ 3.7
PyTorch ≥ 1.8 and torchvision that matches the PyTorch installation.
OpenCV

Preparation

It is highly recommended that you rename the root package directory (originally distributed-pytorch-template/src) to your package name, since src is a generic, uninformative name for a package. A meaningful name makes your project more explicit and reduces potential package name conflicts. After renaming the root package directory, change the package name in setup.py to match, so that pip uses this name for your package.
Set up your datasets as described in datasets/README.md. Set the environment variable DATASETS_ROOT to the directory containing all datasets, either on the command line or, for permanent use, in your shell config file:

export DATASETS_ROOT=/path/to/datasets_root
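
As an illustration of how a script might pick up this setting, here is a minimal sketch that resolves a dataset directory from the DATASETS_ROOT variable named above. The helper name `dataset_path`, the fallback directory `datasets`, and the dataset name `my_dataset` are hypothetical and not part of the template.

```python
import os
from pathlib import Path


def dataset_path(name, env_var="DATASETS_ROOT", default="datasets"):
    """Return the directory of dataset `name` under the configured root.

    Hypothetical helper: falls back to `default` (a relative directory)
    when the environment variable is not set.
    """
    root = Path(os.environ.get(env_var, default)).expanduser()
    return root / name


# Example: with DATASETS_ROOT=/path/to/datasets_root this prints
# /path/to/datasets_root/my_dataset
print(dataset_path("my_dataset"))
```

Reading the variable at call time rather than at import time means a value exported later in the same session is still picked up.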

Build from source

In the root directory of the project, install the package with:

python -m pip install -e .

Getting Started

Training

In tools, we provide a basic training script train_net.py. You can use it as a reference to write your own training scripts.
To train with tools/train_net.py, you can run:

cd tools/
python train_net.py --num-gpus 4 --config ../configs/llie_base.json

The config is made to train with 4 GPUs. You can change the number of GPUs by modifying the --num-gpus option.
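
If you write your own training script, the two options used above could be parsed roughly as follows. This is only a sketch: the function name `build_parser`, the defaults, and the help texts are assumptions, and the real train_net.py may define more options or different defaults.

```python
import argparse


def build_parser():
    # Hypothetical parser mirroring the two options shown in the command above.
    parser = argparse.ArgumentParser(description="Training script sketch")
    parser.add_argument("--num-gpus", type=int, default=1,
                        help="number of GPUs to train on")
    parser.add_argument("--config", type=str, required=True,
                        help="path to a JSON config file")
    return parser


# Parse the same arguments as the example command.
args = build_parser().parse_args(
    ["--num-gpus", "4", "--config", "../configs/llie_base.json"]
)
print(args.num_gpus, args.config)  # 4 ../configs/llie_base.json
```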

To specify the GPU devices, set the environment variable CUDA_VISIBLE_DEVICES:

CUDA_VISIBLE_DEVICES=0,2 python train_net.py --num-gpus 2 --config ../configs/llie_base.json

The above command uses the GPU devices with ids 0 and 2 for training.
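
One detail worth knowing: CUDA renumbers the visible devices starting from 0, so inside the training process the two GPUs appear as cuda:0 and cuda:1. The sketch below shows that mapping by parsing the variable directly; the helper `visible_device_map` is hypothetical and does not query the GPUs themselves.

```python
import os


def visible_device_map(env=None):
    """Map in-process CUDA indices to the ids listed in CUDA_VISIBLE_DEVICES.

    Returns an empty dict when the variable is unset or empty; an unset
    variable means all devices are visible, which this sketch does not enumerate.
    """
    env = os.environ if env is None else env
    value = env.get("CUDA_VISIBLE_DEVICES", "")
    ids = [item.strip() for item in value.split(",") if item.strip()]
    # Ids are kept as strings, since the variable may also hold device UUIDs.
    return {local: physical for local, physical in enumerate(ids)}


print(visible_device_map({"CUDA_VISIBLE_DEVICES": "0,2"}))  # {0: '0', 1: '2'}
```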

ZhaoChuyang/distributed-pytorch-template

Folders and files

Latest commit

History

Repository files navigation

Distributed PyTorch Template

Installation

Requirements

Preparation

Build from source

Getting Started

Training

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages