Merge branch 'main' into list-img-size
shensheng272 authored Sep 13, 2022
2 parents 63d3119 + a384f21 commit a4186a8
Showing 105 changed files with 13,564 additions and 1,016 deletions.
13 changes: 13 additions & 0 deletions .gitignore
@@ -7,7 +7,9 @@ __pycache__/
# C extensions

# Distribution / packaging

.Python
videos/
build/
runs/
weights/
@@ -102,3 +104,14 @@ venv.bak/

# Pytorch
*.pth

#vscode
.vscode/*

#user scripts
*.sh

# model files
*.onnx
*.pt
*.engine
167 changes: 137 additions & 30 deletions README.md
@@ -1,24 +1,22 @@
# MT-YOLOv6 [About Naming YOLOv6](./docs/About_naming_yolov6.md)

# YOLOv6
Implementation of paper - [YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications](https://arxiv.org/abs/2209.02976)
## Introduction

YOLOv6 is a single-stage object detection framework dedicated to industrial applications, with hardware-friendly efficient design and high performance.

<img src="assets/picture.png" width="800">

YOLOv6-nano achieves 35.0 mAP on the COCO val2017 dataset with 1242 FPS on a T4 using TensorRT FP16 for bs32 inference, and YOLOv6-s achieves 43.1 mAP on the COCO val2017 dataset with 520 FPS on a T4 using TensorRT FP16 for bs32 inference.
<img src="assets/speed_comparision_v2.png" width="1000">

YOLOv6 is composed of the following methods:
YOLOv6 has a series of models for various industrial scenarios, including N/T/S/M/L, whose architectures vary in size for a better accuracy-speed trade-off. Some bag-of-freebies methods, such as self-distillation and more training epochs, are introduced to further improve performance. For industrial deployment, we adopt QAT with channel-wise distillation and graph optimization to pursue extreme performance.

- Hardware-friendly Design for Backbone and Neck
- Efficient Decoupled Head with SIoU Loss
YOLOv6-N hits 35.9% AP on the COCO dataset with 1234 FPS on a T4. YOLOv6-S strikes 43.5% AP with 495 FPS, and the quantized YOLOv6-S model achieves 43.3% AP at an accelerated speed of 869 FPS on a T4. YOLOv6-T/M/L also perform excellently, showing higher accuracy than other detectors with similar inference speed.


## Coming soon
## What's New

- [ ] YOLOv6 m/l/x model.
- [ ] Deployment for MNN/TNN/NCNN/CoreML...
- [ ] Quantization tools
- Release M/L models and update N/T/S models with enhanced performance. ⭐️ [Benchmark](#Benchmark)
- 2x faster training time.
- Fix the performance degradation when evaluating on 640x640 inputs.
- Customized quantization methods. 🚀 [Quantization Tutorial](./tools/qat/README.md)


## Quick Start
@@ -33,13 +31,12 @@ pip install -r requirements.txt

### Inference

First, download a pretrained model from the YOLOv6 [release](https://github.com/meituan/YOLOv6/releases/tag/0.1.0)
First, download a pretrained model from the YOLOv6 [release](https://github.com/meituan/YOLOv6/releases/tag/0.2.0)

Second, run inference with `tools/infer.py`

```shell
python tools/infer.py --weights yolov6s.pt --source img.jpg / imgdir
yolov6n.pt
python tools/infer.py --weights yolov6s.pt --source img.jpg / imgdir / video.mp4
```
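
For a quick sanity check on the downloaded weights before running inference, the snippet below is a minimal sketch (not part of the repo); it assumes the `.pt` file is a standard PyTorch checkpoint, which may be either a raw module or a dict of entries.

```python
# Hypothetical sanity check, not repo code: peek inside a downloaded
# checkpoint. Run from the repo root so any pickled model classes
# inside the checkpoint can be resolved on import.
import torch

ckpt = torch.load("yolov6s.pt", map_location="cpu")
if isinstance(ckpt, dict):
    print("checkpoint entries:", list(ckpt.keys()))
else:
    print("checkpoint object:", type(ckpt).__name__)
```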

### Training
Expand All @@ -48,58 +45,168 @@ Single GPU

```shell
python tools/train.py --batch 32 --conf configs/yolov6s.py --data data/coco.yaml --device 0
configs/yolov6n.py
```

Multi GPUs (DDP mode recommended)

```shell
python -m torch.distributed.launch --nproc_per_node 8 tools/train.py --batch 256 --conf configs/yolov6s.py --data data/coco.yaml --device 0,1,2,3,4,5,6,7
configs/yolov6n.py
```

- conf: select config file to specify network/optimizer/hyperparameters
- data: prepare [COCO](http://cocodataset.org) dataset and specify dataset paths in data.yaml


<details>
<summary>Reproduce our results on COCO</summary>

For nano model
```shell
python -m torch.distributed.launch --nproc_per_node 4 tools/train.py \
--batch 128 \
--conf configs/yolov6n.py \
--data data/coco.yaml \
--epoch 400 \
--device 0,1,2,3 \
--name yolov6n_coco
```

For s/tiny model
```shell
python -m torch.distributed.launch --nproc_per_node 8 tools/train.py \
--batch 256 \
--conf configs/yolov6s.py \ # configs/yolov6t.py
--data data/coco.yaml \
--epoch 400 \
--device 0,1,2,3,4,5,6,7 \
--name yolov6s_coco # yolov6t_coco
```

For m/l model
```shell
# Step 1: Training a base model
python -m torch.distributed.launch --nproc_per_node 8 tools/train.py \
--batch 256 \
--conf configs/yolov6m.py \ # configs/yolov6l.py
--data data/coco.yaml \
--epoch 300 \
--device 0,1,2,3,4,5,6,7 \
--name yolov6m_coco # yolov6l_coco


# Step 2: Self-distillation training
python -m torch.distributed.launch --nproc_per_node 8 tools/train.py \
--batch 256 \ # 128 for distillation of yolov6l
--conf configs/yolov6m.py \ # configs/yolov6l.py
--data data/coco.yaml \
--epoch 300 \
--device 0,1,2,3,4,5,6,7 \
--distill \
--teacher_model_path runs/train/yolov6m_coco/weights/best_ckpt.pt \ # yolov6l_coco
--name yolov6m_coco # yolov6l_coco

```
</details>

- conf: select config file to specify network/optimizer/hyperparameters
- data: prepare [COCO](http://cocodataset.org) dataset, [YOLO format coco labels](https://github.com/meituan/YOLOv6/releases/download/0.1.0/coco2017labels.zip) and specify dataset paths in data.yaml
- make sure your dataset structure is as follows (a quick layout check is sketched after the tree):
```
├── coco
│   ├── annotations
│   │   ├── instances_train2017.json
│   │   └── instances_val2017.json
│   ├── images
│   │   ├── train2017
│   │   └── val2017
│   ├── labels
│   │   ├── train2017
│   │   └── val2017
│   ├── LICENSE
│   └── README.txt
```
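
The snippet below is a hedged convenience (a hypothetical helper, not part of the repo) for verifying that layout before training; it assumes `coco` is the dataset root referenced in data/coco.yaml.

```python
# Quick layout check mirroring the directory tree above.
from pathlib import Path

coco = Path("coco")
expected = [
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
    "images/train2017",
    "images/val2017",
    "labels/train2017",
    "labels/val2017",
]
for rel in expected:
    status = "ok" if (coco / rel).exists() else "MISSING"
    print(f"{status:8s}{coco / rel}")
```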

### Evaluation

Reproduce mAP on COCO val2017 dataset
Reproduce mAP on COCO val2017 dataset with 640×640 resolution

```shell
python tools/eval.py --data data/coco.yaml --batch 32 --weights yolov6s.pt --task val
yolov6n.pt
python tools/eval.py --data data/coco.yaml --batch 32 --weights yolov6s.pt --task val --reproduce_640_eval
```


<details>
<summary>Resume training</summary>

If your training process is interrupted, you can resume training with
```
# multi GPU training.
python -m torch.distributed.launch --nproc_per_node 8 tools/train.py --resume
```
You can also pass a checkpoint path to the `--resume` parameter:
```
# remember to replace /path/to/your/checkpoint/path with the checkpoint from which you want to resume training
--resume /path/to/your/checkpoint/path
```

</details>

### Deployment

* [ONNX](./deploy/ONNX)
* [OpenCV Python/C++](./deploy/ONNX/OpenCV)
* [OpenVINO](./deploy/OpenVINO)
* [Partial Quantization](./tools/partial_quantization)
* [TensorRT](./deploy/TensorRT)

### Tutorials

* [Train custom data](./docs/Train_custom_data.md)
* [Test speed](./docs/Test_speed.md)
* [Tutorial of RepOpt for YOLOv6](./docs/tutorial_repopt.md)
* [Tutorial of QAT for YOLOv6](./tools/qat/README.md)


## Benchmark


| Model | Size | mAP<sup>val<br/>0.5:0.95 | Speed<sup>V100<br/>fp16 b32 <br/>(ms) | Speed<sup>V100<br/>fp32 b32 <br/>(ms) | Speed<sup>T4<br/>trt fp16 b1 <br/>(fps) | Speed<sup>T4<br/>trt fp16 b32 <br/>(fps) | Params<br/><sup> (M) | Flops<br/><sup> (G) |
| Model | Size | mAP<sup>val<br/>0.5:0.95 | Speed<sup>T4<br/>trt fp16 b1 <br/>(fps) | Speed<sup>T4<br/>trt fp16 b32 <br/>(fps) | Params<br/><sup> (M) | FLOPs<br/><sup> (G) |
| :----------------------------------------------------------- | ---- | :------------------------------ | --------------------------------------- | ---------------------------------------- | -------------------- | ------------------- |
| [**YOLOv6-N**](https://github.com/meituan/YOLOv6/releases/download/0.2.0/yolov6n.pt) | 640 | 35.9<sup>300e</sup><br/>36.3<sup>400e | 802 | 1234 | 4.3 | 11.1 |
| [**YOLOv6-T**](https://github.com/meituan/YOLOv6/releases/download/0.2.0/yolov6t.pt) | 640 | 40.3<sup>300e</sup><br/>41.1<sup>400e | 449 | 659 | 15.0 | 36.7 |
| [**YOLOv6-S**](https://github.com/meituan/YOLOv6/releases/download/0.2.0/yolov6s.pt) | 640 | 43.5<sup>300e</sup><br/>43.8<sup>400e | 358 | 495 | 17.2 | 44.2 |
| [**YOLOv6-M**](https://github.com/meituan/YOLOv6/releases/download/0.2.0/yolov6m.pt) | 640 | 49.5 | 179 | 233 | 34.3 | 82.2 |
| [**YOLOv6-L-ReLU**](https://github.com/meituan/YOLOv6/releases/download/0.2.0/yolov6l_relu.pt) | 640 | 51.7 | 113 | 149 | 58.5 | 144.0 |
| [**YOLOv6-L**](https://github.com/meituan/YOLOv6/releases/download/0.2.0/yolov6l.pt) | 640 | 52.5 | 98 | 121 | 58.5 | 144.0 |

<details>
<summary>Legacy models</summary>

| Model | Size | mAP<sup>val<br/>0.5:0.95 | Speed<sup>V100<br/>fp16 b32 <br/>(ms) | Speed<sup>V100<br/>fp32 b32 <br/>(ms) | Speed<sup>T4<br/>trt fp16 b1 <br/>(fps) | Speed<sup>T4<br/>trt fp16 b32 <br/>(fps) | Params<br/><sup> (M) | FLOPs<br/><sup> (G) |
| :-------------- | ----------- | :----------------------- | :------------------------------------ | :------------------------------------ | ---------------------------------------- | ----------------------------------------- | --------------- | -------------- |
| [**YOLOv6-n**](https://github.com/meituan/YOLOv6/releases/download/0.1.0/yolov6n.pt) | 416<br/>640 | 30.8<br/>35.0 | 0.3<br/>0.5 | 0.4<br/>0.7 | 1100<br/>788 | 2716<br/>1242 | 4.3<br/>4.3 | 4.7<br/>11.1 |
| [**YOLOv6-tiny**](https://github.com/meituan/YOLOv6/releases/download/0.1.0/yolov6t.pt) | 640 | 41.3 | 0.9 | 1.5 | 425 | 602 | 15.0 | 36.7 |
| [**YOLOv6-s**](https://github.com/meituan/YOLOv6/releases/download/0.1.0/yolov6s.pt) | 640 | 43.1 | 1.0 | 1.7 | 373 | 520 | 17.2 | 44.2 |
| [**YOLOv6-N**](https://github.com/meituan/YOLOv6/releases/download/0.1.0/yolov6n.pt) | 416<br/>640 | 30.8<br/>35.0 | 0.3<br/>0.5 | 0.4<br/>0.7 | 1100<br/>788 | 2716<br/>1242 | 4.3<br/>4.3 | 4.7<br/>11.1 |
| [**YOLOv6-T**](https://github.com/meituan/YOLOv6/releases/download/0.1.0/yolov6t.pt) | 640 | 41.3 | 0.9 | 1.5 | 425 | 602 | 15.0 | 36.7 |
| [**YOLOv6-S**](https://github.com/meituan/YOLOv6/releases/download/0.1.0/yolov6s.pt) | 640 | 43.1 | 1.0 | 1.7 | 373 | 520 | 17.2 | 44.2 |


</details>

- Comparisons of the mAP and speed of different object detectors are tested on [COCO val2017](https://cocodataset.org/#download) dataset.
- Results of the mAP and speed are evaluated on [COCO val2017](https://cocodataset.org/#download) dataset with the input resolution of 640×640.
- Refer to [Test speed](./docs/Test_speed.md) tutorial to reproduce the speed results of YOLOv6.
- Params and Flops of YOLOv6 are estimated on deployed model.
- Speed results of other methods are tested in our environment using the official codebase and model when they are not found in the corresponding official release.
- Params and FLOPs of YOLOv6 are estimated on deployed models.
- For N/T/S models, we use the strategy of training with more epochs.
- For M/L/L-ReLU models, we adopt self-distillation methods to further improve the performance.
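
To read the throughput columns as latency, a one-line conversion may help (ours, not from the repo):

```python
# Convert throughput (FPS) to per-image latency in milliseconds.
def fps_to_ms(fps: float) -> float:
    return 1000.0 / fps

print(round(fps_to_ms(358), 2))  # YOLOv6-S, T4 trt fp16 b1 -> ~2.79 ms
```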



## Third-party resources
* YOLOv6 NCNN Android app demo: [ncnn-android-yolov6](https://github.com/FeiGeChuanShu/ncnn-android-yolov6) from [FeiGeChuanShu](https://github.com/FeiGeChuanShu)
* YOLOv6 ONNXRuntime/MNN/TNN C++: [YOLOv6-ORT](https://github.com/DefTruth/lite.ai.toolkit/blob/main/lite/ort/cv/yolov6.cpp), [YOLOv6-MNN](https://github.com/DefTruth/lite.ai.toolkit/blob/main/lite/mnn/cv/mnn_yolov6.cpp) and [YOLOv6-TNN](https://github.com/DefTruth/lite.ai.toolkit/blob/main/lite/tnn/cv/tnn_yolov6.cpp) from [DefTruth](https://github.com/DefTruth)
* YOLOv6 TensorRT Python: [yolov6-tensorrt-python](https://github.com/Linaom1214/tensorrt-python/blob/main/yolov6/trt.py) from [Linaom1214](https://github.com/Linaom1214)
* YOLOv6 TensorRT Python: [yolov6-tensorrt-python](https://github.com/Linaom1214/TensorRT-For-YOLO-Series) from [Linaom1214](https://github.com/Linaom1214)
* YOLOv6 TensorRT Windows C++: [yolort](https://github.com/zhiqwang/yolov5-rt-stack/tree/main/deployment/tensorrt-yolov6) from [Wei Zeng](https://github.com/Wulingtian)
* YOLOv6 Quantization and Auto Compression example: [YOLOv6-ACT](https://github.com/PaddlePaddle/PaddleSlim/tree/develop/example/auto_compression/pytorch_yolov6) from [PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim)
* [YOLOv6 web demo](https://huggingface.co/spaces/nateraw/yolov6) on [Huggingface Spaces](https://huggingface.co/spaces) with [Gradio](https://github.com/gradio-app/gradio). [![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/nateraw/yolov6)
* Tutorial: [How to train YOLOv6 on a custom dataset](https://blog.roboflow.com/how-to-train-yolov6-on-a-custom-dataset/) <a href="https://colab.research.google.com/drive/1YnbqOinBZV-c9I7fk_UL6acgnnmkXDMM"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a>
* Demo of YOLOv6 inference on Google Colab [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mahdilamb/YOLOv6/blob/main/inference.ipynb)


### [FAQ(Continuously updated)](https://github.com/meituan/YOLOv6/wiki/FAQ%EF%BC%88Continuously-updated%EF%BC%89)
Binary file added assets/image3.jpg
Binary file added assets/speed_comparision_v2.png
Binary file added assets/train_batch.jpg
Binary file added assets/voc_loss_curve.jpg
Binary file added assets/yolov5s.jpg
Binary file added assets/yolov6s.jpg
Binary file added assets/yoloxs.jpg
60 changes: 60 additions & 0 deletions configs/experiment/eval_640_repro.py
@@ -0,0 +1,60 @@
# eval params for different scales

eval_params = dict(
    default = dict(
        img_size=640,
        test_load_size=634,
        letterbox_return_int=True,
        scale_exact=True,
        force_no_pad=True,
        not_infer_on_rect=True,
    ),
    yolov6n = dict(
        img_size=640,
        test_load_size=638,
        letterbox_return_int=True,
        scale_exact=True,
        force_no_pad=True,
        not_infer_on_rect=True,
    ),
    yolov6t = dict(
        img_size=640,
        test_load_size=634,
        letterbox_return_int=True,
        scale_exact=True,
        force_no_pad=True,
        not_infer_on_rect=True,
    ),
    yolov6s = dict(
        img_size=640,
        test_load_size=638,
        letterbox_return_int=True,
        scale_exact=True,
        force_no_pad=True,
        not_infer_on_rect=True,
    ),
    yolov6m = dict(
        img_size=640,
        test_load_size=628,
        letterbox_return_int=True,
        scale_exact=True,
        force_no_pad=True,
        not_infer_on_rect=True,
    ),
    yolov6l = dict(
        img_size=640,
        test_load_size=632,
        letterbox_return_int=True,
        scale_exact=True,
        force_no_pad=True,
        not_infer_on_rect=True,
    ),
    yolov6l_relu = dict(
        img_size=640,
        test_load_size=638,
        letterbox_return_int=True,
        scale_exact=True,
        force_no_pad=True,
        not_infer_on_rect=True,
    )
)
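
How these per-model entries are consumed is sketched below as an assumption on our part (not repo code): start from the `default` entry and overlay the dict matching the model name.

```python
# Hypothetical lookup: resolve eval params for a model name, falling
# back to the `default` entry for models without their own dict.
def resolve_eval_params(eval_params: dict, model_name: str) -> dict:
    params = dict(eval_params["default"])           # defaults first
    params.update(eval_params.get(model_name, {}))  # per-model overrides
    return params

# e.g. resolve_eval_params(eval_params, "yolov6m")["test_load_size"] == 628
```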
76 changes: 76 additions & 0 deletions configs/experiment/yolov6n_with_eval_params.py
@@ -0,0 +1,76 @@
# YOLOv6n model with eval params (when training)
model = dict(
    type='YOLOv6n',
    pretrained=None,
    depth_multiple=0.33,
    width_multiple=0.25,
    backbone=dict(
        type='EfficientRep',
        num_repeats=[1, 6, 12, 18, 6],
        out_channels=[64, 128, 256, 512, 1024],
    ),
    neck=dict(
        type='RepPANNeck',
        num_repeats=[12, 12, 12, 12],
        out_channels=[256, 128, 128, 256, 256, 512],
    ),
    head=dict(
        type='EffiDeHead',
        in_channels=[128, 256, 512],
        num_layers=3,
        begin_indices=24,
        anchors=1,
        out_indices=[17, 20, 23],
        strides=[8, 16, 32],
        iou_type='siou',
        use_dfl=False,
        reg_max=0  # if use_dfl is False, please set reg_max to 0
    )
)

solver = dict(
    optim='SGD',
    lr_scheduler='Cosine',
    lr0=0.02,  # 0.01 # 0.02
    lrf=0.01,
    momentum=0.937,
    weight_decay=0.0005,
    warmup_epochs=3.0,
    warmup_momentum=0.8,
    warmup_bias_lr=0.1
)

data_aug = dict(
    hsv_h=0.015,
    hsv_s=0.7,
    hsv_v=0.4,
    degrees=0.0,
    translate=0.1,
    scale=0.5,
    shear=0.0,
    flipud=0.0,
    fliplr=0.5,
    mosaic=1.0,
    mixup=0.0,
)

eval_params = dict(
    img_size=None,  # None means the same as the training image size
    conf_thres=0.03,
    iou_thres=0.65,
    task='train',

    # padding and scale coords
    test_load_size=None,  # None means the same as the test image size
    letterbox_return_int=False,
    force_no_pad=False,
    not_infer_on_rect=False,
    scale_exact=False,

    # metrics
    verbose=False,
    do_coco_metric=True,
    do_pr_metric=False,
    plot_curve=False,
    plot_confusion_matrix=False
)
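
For reference, `depth_multiple`/`width_multiple` in the model dict above scale `num_repeats` and `out_channels`. The sketch below shows the usual YOLO-family scaling rule; treat it as an assumption, since the exact rounding inside YOLOv6 may differ.

```python
import math

# Assumed scaling rules (YOLO-family convention, not verified against
# YOLOv6 internals): depth scales block repeats, width scales channels
# rounded up to a multiple of the divisor.
def scale_repeats(n: int, depth_multiple: float = 0.33) -> int:
    return max(round(n * depth_multiple), 1) if n > 1 else n

def scale_channels(c: int, width_multiple: float = 0.25, divisor: int = 8) -> int:
    return math.ceil(c * width_multiple / divisor) * divisor

print(scale_repeats(12))     # 4  (backbone stage with 12 repeats)
print(scale_channels(1024))  # 256 (last backbone stage)
```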
