This repository is my reproduction of classical object detection models in PyTorch. (It is for my own study and draws on others' implementations; feel free to open issues 😳)
- SSD: Single Shot MultiBox Detector
- YOLO v2
- YOLO v3 (without training)
Note: This implementation is mainly based on amdegroot's ssd 👍.
Detailed instructions are in each sub-directory: ssd, yolo, yolo3
- Python 3.4+
- PyTorch 0.4 (Note: you can install it from source or with the following conda command)
conda install -c ostrokach-forge pytorch=0.4.0
- OpenCV (optional; PIL.Image also works)
- CUDA 8.0 or higher (optional)
PASCAL_VOC 07+12: follow the instructions in amdegroot's ssd
- Download VOC2007 trainval & test

      # specify a directory for dataset to be downloaded into, else default is ~/data/
      sh dataset/scripts/VOC2007.sh # <directory>

- Download VOC2012 trainval

      # specify a directory for dataset to be downloaded into, else default is ~/data/
      sh dataset/scripts/VOC2012.sh # <directory>

(Note: if your dataset is not in ~/data, please change the home parameter in dataset/config.py to your data path.)
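
As a rough illustration of the note above, here is a minimal sketch of how a dataset root like this is usually resolved. The names `DEFAULT_HOME`, `resolve_home` and `voc_year_dir` are hypothetical and are not taken from dataset/config.py. The `VOCdevkit/VOC<year>` layout is the one the official VOC archives unpack to, and I am assuming the download scripts keep it.

```python
import os

# Hypothetical stand-in for the `home` parameter in dataset/config.py.
DEFAULT_HOME = os.path.join("~", "data")


def resolve_home(home=DEFAULT_HOME):
    """Expand `~` and return an absolute dataset directory."""
    return os.path.abspath(os.path.expanduser(home))


def voc_year_dir(home, year):
    """Build the folder for one VOC release under the dataset root.

    Assumes the usual VOCdevkit/VOC<year> layout of the official archives.
    """
    return os.path.join(resolve_home(home), "VOCdevkit", "VOC" + str(year))


if __name__ == "__main__":
    for year in (2007, 2012):
        path = voc_year_dir(DEFAULT_HOME, year)
        print(path, "(found)" if os.path.isdir(path) else "(missing)")
```

Running it prints whether each expected folder exists, which is a quick way to check the path before training.
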
detection model | mAP(07) | mAP(10) | Google Drive | Baidu Drive |
---|---|---|---|---|
SSD (vgg16) | 77.55% | 80.10% | vgg_final.pth | vgg_final.pth |
SSD (res101) | 75.97% | 78.26% | resnet_final.pth | resnet_final.pth |
YOLOv2 (official) | 73.40% | 75.80% | yolo-voc.pth | yolo-voc.pth |
YOLOv2 (w/o multi) | 67.73% | 69.61% | yolo_160.pth | yolo_160.pth |
YOLOv3 (official) | | | yolo3.pth | yolo3.pth |
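
The table does not say how the two mAP columns differ. My assumption is that mAP(07) is the 11-point interpolated AP from VOC2007 and mAP(10) is the area-under-curve AP used from VOC2010 onward. Under that assumption, a minimal sketch of the two per-class AP computations looks like this (the function name `voc_ap` and its arguments are illustrative, not this repository's API):

```python
import numpy as np


def voc_ap(recall, precision, use_07_metric=False):
    """Average precision for one class from a precision/recall curve.

    `recall` and `precision` are 1-D arrays ordered by descending
    detection score (recall is non-decreasing).
    """
    recall = np.asarray(recall, dtype=np.float64)
    precision = np.asarray(precision, dtype=np.float64)

    if use_07_metric:
        # 11-point metric: average the best precision at recall >= t
        # for t = 0.0, 0.1, ..., 1.0.
        ap = 0.0
        for t in np.linspace(0.0, 1.0, 11):
            mask = recall >= t
            ap += (precision[mask].max() if mask.any() else 0.0) / 11.0
        return ap

    # Area metric: make precision monotonically non-increasing,
    # then integrate it over the recall steps.
    mrec = np.concatenate(([0.0], recall, [1.0]))
    mpre = np.concatenate(([0.0], precision, [0.0]))
    for i in range(mpre.size - 1, 0, -1):
        mpre[i - 1] = max(mpre[i - 1], mpre[i])
    steps = np.where(mrec[1:] != mrec[:-1])[0]
    return float(np.sum((mrec[steps + 1] - mrec[steps]) * mpre[steps + 1]))


ap_07 = voc_ap([0.5, 1.0], [1.0, 0.5], use_07_metric=True)
ap_area = voc_ap([0.5, 1.0], [1.0, 0.5], use_07_metric=False)
```

The two metrics give different numbers for the same curve, which would explain why the two columns disagree for every model.
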
Note:
- The pretrained VGG model is converted from Caffe and was downloaded from amdegroot's ssd, while the pretrained res101 comes from the torchvision pretrained models. (I guess this is why the res101-based model performs worse than the VGG-based one.)
- "YOLOv2 (official)" means the weights come from pjreddie's website (I cannot find them there now 😂)
- In SSD, the input data has the mean subtracted and is not divided by 255. In YOLO, the mean is not subtracted and the data is divided by 255. (No particular reason; it is what each pretrained base network expects 😅)
- YOLO with multi-scale training may need more epochs.
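
The preprocessing difference in the notes above can be sketched as follows. This is only an illustration: the mean values in `ASSUMED_MEAN` are the ones commonly used with Caffe-converted VGG weights, not values read from this repository, and the function names are hypothetical.

```python
import numpy as np

# Assumed per-channel mean (commonly used with Caffe-converted VGG weights);
# check the repository's own config for the values it actually uses.
ASSUMED_MEAN = np.array([104.0, 117.0, 123.0], dtype=np.float32)


def preprocess_ssd(image):
    """SSD style: subtract the mean, keep the 0-255 value range."""
    return image.astype(np.float32) - ASSUMED_MEAN


def preprocess_yolo(image):
    """YOLO style: no mean subtraction, scale values to [0, 1]."""
    return image.astype(np.float32) / 255.0


# A tiny all-white H x W x C image, just to show the resulting value ranges.
image = np.full((2, 2, 3), 255, dtype=np.uint8)
ssd_input = preprocess_ssd(image)
yolo_input = preprocess_yolo(image)
```

Mixing the two up (for example, feeding [0, 1] inputs to the SSD weights) will usually give poor detections, so it is worth checking when swapping base networks.
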
There are several important features not included in this repository:
- Only the VOC dataset is supported; other datasets (e.g. COCO) are not
- Only a single GPU card is supported; multiprocess is not
(I am sorry about these. I only have one GPU card and cannot implement the features above ~ :neutral_face:)
Thanks to these authors for their great work. :heart: