vedaseg is an open source semantic segmentation toolbox based on PyTorch.
- Modular Design

  We decompose the semantic segmentation framework into different components. The flexible and extensible design makes it easy to implement a customized semantic segmentation project by combining different modules, like building with Lego.
- Support of several popular frameworks

  The toolbox supports several popular semantic segmentation frameworks out of the box, e.g. DeepLabv3+, DeepLabv3, U-Net, PSPNet, FPN, etc.
This project is released under the Apache 2.0 license.
Note: All models are trained only on the PASCAL VOC 2012 trainaug dataset and evaluated on the PASCAL VOC 2012 val dataset.
Architecture | Backbone | OS | MS & Flip | mIoU |
---|---|---|---|---|
DeepLabv3plus | ResNet-101 | 16 | True | 79.80% |
DeepLabv3plus | ResNet-101 | 16 | False | 78.19% |
DeepLabv3 | ResNet-101 | 16 | True | 78.94% |
DeepLabv3 | ResNet-101 | 16 | False | 77.07% |
FPN | ResNet-101 | 2 | True | 75.42% |
FPN | ResNet-101 | 2 | False | 73.65% |
PSPNet | ResNet-101 | 8 | True | 74.68% |
PSPNet | ResNet-101 | 8 | False | 73.71% |
U-Net | ResNet-101 | 1 | True | 73.09% |
U-Net | ResNet-101 | 1 | False | 70.98% |
OS: Output stride used during evaluation
MS: Multi-scale inputs during evaluation
Flip: Adding left-right flipped inputs during evaluation
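The table reports mIoU (mean intersection over union) without spelling out the metric. The following is a minimal sketch of the standard definition, not vedaseg's own evaluation code. The function names are hypothetical, labels are assumed to be flattened lists of class indices, and 255 is assumed to mark ignored pixels, as in the PASCAL VOC annotations.

```python
from typing import List, Sequence


def confusion_matrix(pred: Sequence[int], target: Sequence[int],
                     num_classes: int, ignore_index: int = 255) -> List[List[int]]:
    """Count pixels; rows are ground-truth classes, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:  # assumption: 255 marks "void" pixels
            continue
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp
        fp = sum(matrix[r][c] for r in range(n)) - tp
        denom = tp + fp + fn
        if denom == 0:
            continue  # class absent from both prediction and ground truth
        ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    target = [0, 0, 1, 1, 255]
    pred = [0, 1, 1, 1, 0]
    print(mean_iou(confusion_matrix(pred, target, num_classes=2)))
```

In this toy example class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12. In practice the confusion matrix would be accumulated over the whole validation set before the per-class IoUs are averaged.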
The models above are available on Google Drive.
- Linux
- Python 3.7+
- PyTorch 1.1.0 or higher
- CUDA 9.0 or higher
We have tested the following versions of OS and software:
- OS: Ubuntu 16.04.6 LTS
- CUDA: 9.0
- Python: 3.7.3
a. Create a conda virtual environment and activate it.
conda create -n vedaseg python=3.7 -y
conda activate vedaseg
b. Install PyTorch and torchvision following the official instructions, e.g.,
conda install pytorch torchvision -c pytorch
c. Clone the vedaseg repository.
git clone https://github.com/Media-Smart/vedaseg.git
cd vedaseg
vedaseg_root=${PWD}
d. Install dependencies.
pip install -r requirements.txt
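After the steps above, the environment can be compared against the stated requirements (Python 3.7+, PyTorch 1.1.0 or higher, CUDA 9.0 or higher). The script below is a hedged sketch rather than part of vedaseg; `parse_version`, `meets_minimum`, and `check_environment` are hypothetical helper names, and the only PyTorch attributes used are `torch.__version__` and `torch.version.cuda`.

```python
import sys
from typing import Dict, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.1.0' or '1.13.1+cu117' into a tuple of ints."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare versions numerically, padding the shorter one with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment() -> Dict[str, bool]:
    """Report whether each requirement listed in this README is satisfied."""
    report = {"python": sys.version_info[:2] >= (3, 7)}
    try:
        import torch
    except ImportError:
        report["pytorch"] = False
        report["cuda"] = False
        return report
    report["pytorch"] = meets_minimum(torch.__version__, "1.1.0")
    cuda_version = torch.version.cuda  # None for CPU-only builds
    report["cuda"] = cuda_version is not None and meets_minimum(cuda_version, "9.0")
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'missing or too old'}")
```

Note that `torch.version.cuda` reports the CUDA version PyTorch was built against, which may differ from the toolkit installed system-wide.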
Download Pascal VOC 2012 and the augmented Pascal VOC 2012 annotations, resulting in 10,582 training images (trainaug) and 1,449 validation images.
cd ${vedaseg_root}
mkdir ${vedaseg_root}/data
cd ${vedaseg_root}/data
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
wget http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/semantic_contours/benchmark.tgz
tar xf VOCtrainval_11-May-2012.tar
tar xf benchmark.tgz
python ../tools/encode_voc12_aug.py
python ../tools/encode_voc12.py
mkdir VOCdevkit/VOC2012/EncodeSegmentationClass
#cp benchmark_RELEASE/dataset/encode_cls/* VOCdevkit/VOC2012/EncodeSegmentationClass
(cd benchmark_RELEASE/dataset/encode_cls; cp * ${vedaseg_root}/data/VOCdevkit/VOC2012/EncodeSegmentationClass)
#cp VOCdevkit/VOC2012/EncodeSegmentationClassPart/* VOCdevkit/VOC2012/EncodeSegmentationClass
(cd VOCdevkit/VOC2012/EncodeSegmentationClassPart; cp * ${vedaseg_root}/data/VOCdevkit/VOC2012/EncodeSegmentationClass)
comm -23 <(cat benchmark_RELEASE/dataset/{train,val}.txt VOCdevkit/VOC2012/ImageSets/Segmentation/train.txt | sort -u) <(cat VOCdevkit/VOC2012/ImageSets/Segmentation/val.txt | sort -u) > VOCdevkit/VOC2012/ImageSets/Segmentation/trainaug.txt
To avoid tedious manual work, you can save the Linux commands above as a shell script and execute it.
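The final `comm -23` command above is the least obvious step: it takes the union of the augmented train and val lists and the VOC 2012 train list, then removes every image ID that appears in the VOC 2012 val list. A minimal Python sketch of the same set logic follows; `read_ids` and `build_trainaug` are hypothetical names and not part of vedaseg.

```python
from typing import Iterable, List


def read_ids(path: str) -> List[str]:
    """Read one image ID per line, skipping blank lines."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def build_trainaug(sbd_train: Iterable[str], sbd_val: Iterable[str],
                   voc_train: Iterable[str], voc_val: Iterable[str]) -> List[str]:
    """Union of the three training sources minus the VOC 2012 val IDs."""
    candidates = set(sbd_train) | set(sbd_val) | set(voc_train)
    return sorted(candidates - set(voc_val))


if __name__ == "__main__":
    # Toy IDs only; real lists would come from read_ids() on the files
    # referenced in the shell commands above.
    ids = build_trainaug(
        sbd_train=["2008_000002", "2008_000003"],
        sbd_val=["2008_000009"],
        voc_train=["2008_000003", "2007_000032"],
        voc_val=["2008_000009", "2007_000033"],
    )
    print(ids)
```

Removing the val IDs matters because the augmented lists overlap with the VOC 2012 val split; keeping those images in trainaug would leak validation data into training.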
a. Config
Modify the settings in a config file such as configs/deeplabv3plus.py as needed.
b. Run
python tools/trainval.py configs/deeplabv3plus.py
Snapshots and logs will be generated in ${vedaseg_root}/workdir.
a. Config
Modify the settings in a config file such as configs/deeplabv3plus.py as needed.
b. Run
python tools/test.py configs/deeplabv3plus.py path_to_deeplabv3plus_weights
This repository is currently maintained by Hongxiang Cai (@hxcai) and Yichao Xiong (@mileistone).
We borrowed a lot of code from mmcv and mmdetection; thanks to open-mmlab.