update ssd code
wz authored and wz committed Jun 29, 2020
1 parent c667ba6 commit c1a8602
Showing 9 changed files with 269 additions and 18 deletions.
4 changes: 2 additions & 2 deletions README.md
@@ -18,8 +18,8 @@
* MobileNet (done)
* ShuffleNet (in preparation)
* Object detection
* Faster RCNN (in progress)
* SSD (in preparation)
* Faster RCNN/FPN (in progress)
* SSD/RetinaNet (in progress)
* YOLO v3
* Object segmentation

2 changes: 2 additions & 0 deletions pytorch_object_detection/faster_rcnn/README.md
@@ -8,6 +8,7 @@
* Training on a GPU is recommended

## File structure:
```
* ├── backbone: feature extraction networks; choose one to suit your needs
* ├── network_files: the Faster R-CNN network (including the Fast R-CNN and RPN modules)
* ├── train_utils: training and validation modules (including cocotools)
@@ -17,6 +18,7 @@
* ├── train_multi_GPU.py: for users training on multiple GPUs
* ├── predict.py: a simple prediction script that runs inference with trained weights
* ├── pascal_voc_classes.json: Pascal VOC label file
```

## Pretrained weights download links (place the files in the backbone folder after downloading):
* MobileNetV2 backbone: https://download.pytorch.org/models/mobilenet_v2-b0353104.pth
46 changes: 45 additions & 1 deletion pytorch_object_detection/ssd/README.md
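@@ -1 +1,45 @@
# Code in progress, stay tuned...
# SSD: Single Shot MultiBox Detector

## Environment setup:
* Python 3.6 or 3.7
* PyTorch 1.5 (note: version 1.5 specifically)
* pycocotools (Linux: pip install pycocotools;
Windows: pip install pycocotools-windows (no separate Visual Studio installation required))
* Ubuntu or CentOS (Windows is not recommended)
* Training on a GPU is recommended

## File structure:
```
├── src: modules implementing the SSD model
│ ├── resnet50_backbone.py uses ResNet50 as the SSD backbone
│ ├── ssd_model.py SSD network architecture
│ └── utils.py helper functions used during training
├── train_utils: training and validation modules (including cocotools)
├── my_dataset.py: custom dataset for reading the VOC dataset
├── train_ssd300.py: trains the SSD network with a ResNet50 backbone
├── train_multi_GPU.py: for users training on multiple GPUs
├── predict_test.py: a simple prediction script that runs inference with trained weights
├── pascal_voc_classes.json: Pascal VOC label file
├── plot_curve.py: plots the training loss and the validation mAP
```

## Pretrained weights download link (place the file in the src folder after downloading):
* ResNet50+SSD: https://ngc.nvidia.com/catalog/models
`search for ssd -> find SSD for PyTorch(FP32) -> download FP32 -> extract the archive`

## Dataset: this example uses the PASCAL VOC2012 dataset (place it in the project's current folder after downloading)
* Pascal VOC2012 train/val dataset download link: http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
* For the Pascal VOC2007 test dataset, see: http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
* If you are unfamiliar with the dataset or want to train on your own dataset, see my bilibili: https://b23.tv/F1kSCK

## Training
* Make sure the dataset is prepared in advance
* Make sure the corresponding pretrained model weights are downloaded in advance
* For single-GPU or CPU training, run the train_ssd300.py training script directly
* For multi-GPU training, use the command "python -m torch.distributed.launch --nproc_per_node=8 --use_env train_multi_GPU.py", where nproc_per_node is the number of GPUs to use

## If the principles of the SSD algorithm are unclear, see my bilibili
* https://b23.tv/GJnkOD

## For more about this project and an analysis of the SSD code, see my bilibili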

95 changes: 95 additions & 0 deletions pytorch_object_detection/ssd/draw_box_utils.py
@@ -0,0 +1,95 @@
import collections
import PIL.ImageDraw as ImageDraw
import PIL.ImageFont as ImageFont
import numpy as np

STANDARD_COLORS = [
'AliceBlue', 'Chartreuse', 'Aqua', 'Aquamarine', 'Azure', 'Beige', 'Bisque',
'BlanchedAlmond', 'BlueViolet', 'BurlyWood', 'CadetBlue', 'AntiqueWhite',
'Chocolate', 'Coral', 'CornflowerBlue', 'Cornsilk', 'Crimson', 'Cyan',
'DarkCyan', 'DarkGoldenRod', 'DarkGrey', 'DarkKhaki', 'DarkOrange',
'DarkOrchid', 'DarkSalmon', 'DarkSeaGreen', 'DarkTurquoise', 'DarkViolet',
'DeepPink', 'DeepSkyBlue', 'DodgerBlue', 'FireBrick', 'FloralWhite',
'ForestGreen', 'Fuchsia', 'Gainsboro', 'GhostWhite', 'Gold', 'GoldenRod',
'Salmon', 'Tan', 'HoneyDew', 'HotPink', 'IndianRed', 'Ivory', 'Khaki',
'Lavender', 'LavenderBlush', 'LawnGreen', 'LemonChiffon', 'LightBlue',
'LightCoral', 'LightCyan', 'LightGoldenRodYellow', 'LightGray', 'LightGrey',
'LightGreen', 'LightPink', 'LightSalmon', 'LightSeaGreen', 'LightSkyBlue',
'LightSlateGray', 'LightSlateGrey', 'LightSteelBlue', 'LightYellow', 'Lime',
'LimeGreen', 'Linen', 'Magenta', 'MediumAquaMarine', 'MediumOrchid',
'MediumPurple', 'MediumSeaGreen', 'MediumSlateBlue', 'MediumSpringGreen',
'MediumTurquoise', 'MediumVioletRed', 'MintCream', 'MistyRose', 'Moccasin',
'NavajoWhite', 'OldLace', 'Olive', 'OliveDrab', 'Orange', 'OrangeRed',
'Orchid', 'PaleGoldenRod', 'PaleGreen', 'PaleTurquoise', 'PaleVioletRed',
'PapayaWhip', 'PeachPuff', 'Peru', 'Pink', 'Plum', 'PowderBlue', 'Purple',
'Red', 'RosyBrown', 'RoyalBlue', 'SaddleBrown', 'Green', 'SandyBrown',
'SeaGreen', 'SeaShell', 'Sienna', 'Silver', 'SkyBlue', 'SlateBlue',
'SlateGray', 'SlateGrey', 'Snow', 'SpringGreen', 'SteelBlue', 'GreenYellow',
'Teal', 'Thistle', 'Tomato', 'Turquoise', 'Violet', 'Wheat', 'White',
'WhiteSmoke', 'Yellow', 'YellowGreen'
]


def filter_low_thresh(boxes, scores, classes, category_index, thresh, box_to_display_str_map, box_to_color_map):
    for i in range(boxes.shape[0]):
        if scores[i] > thresh:
            box = tuple(boxes[i].tolist())  # numpy -> list -> tuple
            if classes[i] in category_index.keys():
                class_name = category_index[classes[i]]
            else:
                class_name = 'N/A'
            display_str = str(class_name)
            display_str = '{}: {}%'.format(display_str, int(100 * scores[i]))
            box_to_display_str_map[box].append(display_str)
            box_to_color_map[box] = STANDARD_COLORS[
                classes[i] % len(STANDARD_COLORS)]
        else:
            # The network's output scores are already sorted, so once one
            # falls below the threshold, all later ones do too.
            break


def draw_text(draw, box_to_display_str_map, box, left, right, top, bottom, color):
    try:
        font = ImageFont.truetype('arial.ttf', 24)
    except IOError:
        font = ImageFont.load_default()

    # If the total height of the display strings added to the top of the bounding
    # box exceeds the top of the image, stack the strings below the bounding box
    # instead of above.
    display_str_heights = [font.getsize(ds)[1] for ds in box_to_display_str_map[box]]
    # Each display_str has a top and bottom margin of 0.05x.
    total_display_str_height = (1 + 2 * 0.05) * sum(display_str_heights)

    if top > total_display_str_height:
        text_bottom = top
    else:
        text_bottom = bottom + total_display_str_height
    # Reverse list and print from bottom to top.
    for display_str in box_to_display_str_map[box][::-1]:
        text_width, text_height = font.getsize(display_str)
        margin = np.ceil(0.05 * text_height)
        draw.rectangle([(left, text_bottom - text_height - 2 * margin),
                        (left + text_width, text_bottom)], fill=color)
        draw.text((left + margin, text_bottom - text_height - margin),
                  display_str,
                  fill='black',
                  font=font)
        text_bottom -= text_height - 2 * margin


def draw_box(image, boxes, classes, scores, category_index, thresh=0.5, line_thickness=8):
    box_to_display_str_map = collections.defaultdict(list)
    box_to_color_map = collections.defaultdict(str)

    filter_low_thresh(boxes, scores, classes, category_index, thresh, box_to_display_str_map, box_to_color_map)

    # Draw all boxes onto image.
    draw = ImageDraw.Draw(image)
    im_width, im_height = image.size
    for box, color in box_to_color_map.items():
        xmin, ymin, xmax, ymax = box
        (left, right, top, bottom) = (xmin * 1, xmax * 1,
                                      ymin * 1, ymax * 1)
        draw.line([(left, top), (left, bottom), (right, bottom),
                   (right, top), (left, top)], width=line_thickness, fill=color)
        draw_text(draw, box_to_display_str_map, box, left, right, top, bottom, color)
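
The early `break` in `filter_low_thresh` only works because the scores arrive in descending order. The following is a minimal plain-Python sketch of that idea, not code from this commit; `keep_above_threshold` and the sample detections are hypothetical names chosen for illustration.

```python
def keep_above_threshold(scores, labels, thresh=0.5):
    """Return display strings for detections scoring above thresh.

    Assumes scores are sorted in descending order (as the comment in
    filter_low_thresh states), so the loop can stop at the first miss.
    """
    kept = []
    for score, label in zip(scores, labels):
        if score <= thresh:
            break  # every later score is at most this one
        kept.append('{}: {}%'.format(label, int(100 * score)))
    return kept


# Hypothetical detections, already sorted by score.
example = keep_above_threshold([0.75, 0.5, 0.25], ['dog', 'cat', 'bird'])
print(example)
```

If the scores were not sorted, the `break` would silently drop valid detections, so a `continue` would be the safer choice in that case.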
72 changes: 72 additions & 0 deletions pytorch_object_detection/ssd/predict_test.py
@@ -0,0 +1,72 @@
import torch
from draw_box_utils import draw_box
from PIL import Image
import json
import matplotlib.pyplot as plt
from src.ssd_model import SSD300, Backbone
import transform


def create_model(num_classes):
    backbone = Backbone()
    model = SSD300(backbone=backbone, num_classes=num_classes)

    return model


# get devices
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)

# create model
model = create_model(num_classes=21)

# load train weights
train_weights = "./save_weights/ssd300-15.pth"
train_weights_dict = torch.load(train_weights, map_location=device)['model']

model.load_state_dict(train_weights_dict, strict=False)
model.to(device)

# read class_indict
category_index = {}
try:
    # use a context manager so the file handle is closed after reading
    with open('./pascal_voc_classes.json', 'r') as json_file:
        class_dict = json.load(json_file)
    category_index = {v: k for k, v in class_dict.items()}
except Exception as e:
    print(e)
    exit(-1)

# load image
original_img = Image.open("./test.jpg")

# from PIL image to tensor: resize, convert to tensor, then normalize
data_transform = transform.Compose([transform.Resize(),
transform.ToTensor(),
transform.Normalization()])
img, _ = data_transform(original_img)
# expand batch dimension
img = torch.unsqueeze(img, dim=0)

model.eval()
with torch.no_grad():
    predictions = model(img.to(device))[0]  # bboxes_out, labels_out, scores_out
    predict_boxes = predictions[0].to("cpu").numpy()
    predict_boxes[:, [0, 2]] = predict_boxes[:, [0, 2]] * original_img.size[0]
    predict_boxes[:, [1, 3]] = predict_boxes[:, [1, 3]] * original_img.size[1]
    predict_classes = predictions[1].to("cpu").numpy()
    predict_scores = predictions[2].to("cpu").numpy()

if len(predict_boxes) == 0:
    print("No objects detected!")

draw_box(original_img,
         predict_boxes,
         predict_classes,
         predict_scores,
         category_index,
         thresh=0.5,
         line_thickness=5)
plt.imshow(original_img)
plt.show()
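
predict_test.py multiplies box columns 0 and 2 by `original_img.size[0]` (the width) and columns 1 and 3 by `original_img.size[1]` (the height), which suggests the model returns coordinates relative to the image size. A small sketch of that conversion in plain Python is shown below; `scale_boxes` and the sample values are hypothetical and stand in for the NumPy indexing used in the script.

```python
def scale_boxes(boxes, image_width, image_height):
    """Convert [xmin, ymin, xmax, ymax] boxes from relative (0-1) units to pixels."""
    return [
        [xmin * image_width, ymin * image_height,
         xmax * image_width, ymax * image_height]
        for xmin, ymin, xmax, ymax in boxes
    ]


# One hypothetical relative box on a 400x200 image.
pixel_boxes = scale_boxes([[0.25, 0.5, 0.75, 1.0]], image_width=400, image_height=200)
print(pixel_boxes)
```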
4 changes: 3 additions & 1 deletion pytorch_object_detection/ssd/src/ssd_model.py
@@ -100,7 +100,7 @@ def bbox_view(self, features, loc_extractor, conf_extractor):
        locs, confs = torch.cat(locs, 2).contiguous(), torch.cat(confs, 2).contiguous()
        return locs, confs

    def forward(self, image, targets):
    def forward(self, image, targets=None):
        x = self.feature_extractor(image)

        # Feature Map 38x38x1024, 19x19x512, 10x10x512, 5x5x256, 3x3x256, 1x1x256
@@ -117,6 +117,8 @@ def forward(self, image, targets):
        # 38x38x4 + 19x19x6 + 10x10x6 + 5x5x6 + 3x3x4 + 1x1x4 = 8732

        if self.training:
            if targets is None:
                raise ValueError("In training mode, targets should be passed")
            # bboxes_out (Tensor 8732 x 4), labels_out (Tensor 8732)
            bboxes_out = targets['boxes']
            bboxes_out = bboxes_out.transpose(1, 2).contiguous()
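
The figure 8732 in the comment above is the total number of default boxes across the six feature maps. As a quick check of that arithmetic, the sketch below recomputes it from the sizes and per-location box counts listed in the comment; the variable names are illustrative only.

```python
# (feature map side length, default boxes per location), taken from the comment
feature_maps = [(38, 4), (19, 6), (10, 6), (5, 6), (3, 4), (1, 4)]

# Each level contributes side * side locations times its boxes per location.
per_level = [side * side * boxes for side, boxes in feature_maps]
total = sum(per_level)

print(per_level, total)
```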
60 changes: 48 additions & 12 deletions pytorch_object_detection/ssd/train_ssd300.py
@@ -9,10 +9,11 @@

def create_model(num_classes=21, device=torch.device('cpu')):
    # https://download.pytorch.org/models/resnet50-19c8e357.pth
    pre_train_path = "./src/resnet50.pth"
    backbone = Backbone(pretrain_path=pre_train_path)
    # pre_train_path = "./src/resnet50.pth"
    backbone = Backbone()
    model = SSD300(backbone=backbone, num_classes=num_classes)

    # https://ngc.nvidia.com/catalog/models -> search ssd -> download FP32
    pre_ssd_path = "./src/nvidia_ssdpyt_fp32.pt"
    pre_model_dict = torch.load(pre_ssd_path, map_location=device)
    pre_weights_dict = pre_model_dict["model"]
@@ -33,8 +34,8 @@ def create_model(num_classes=21, device=torch.device('cpu')):
    return model


def main():
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
def main(parser_data):
    device = torch.device(parser_data.device if torch.cuda.is_available() else "cpu")
    print(device)

    if not os.path.exists("save_weights"):
@@ -53,16 +54,16 @@ def main():
                                  transform.Normalization()])
    }

    voc_path = "../"
    train_dataset = VOC2012DataSet(voc_path, data_transform['train'], True)
    VOC_root = parser_data.data_path
    train_dataset = VOC2012DataSet(VOC_root, data_transform['train'], True)
    # Note: batch_size must be greater than 1 during training
    train_data_loader = torch.utils.data.DataLoader(train_dataset,
                                                    batch_size=8,
                                                    shuffle=True,
                                                    num_workers=0,
                                                    num_workers=4,
                                                    collate_fn=utils.collate_fn)

    val_dataset = VOC2012DataSet(voc_path, data_transform['val'], False)
    val_dataset = VOC2012DataSet(VOC_root, data_transform['val'], False)
    val_data_loader = torch.utils.data.DataLoader(val_dataset,
                                                  batch_size=1,
                                                  shuffle=False,
@@ -74,26 +75,35 @@ def main():

    # define optimizer
    params = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.SGD(params, lr=0.002,
    optimizer = torch.optim.SGD(params, lr=0.0005,
                                momentum=0.9, weight_decay=0.0005)
    # learning rate scheduler
    lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer,
                                                   step_size=5,
                                                   gamma=0.3)

    # If a checkpoint saved by a previous run is specified, resume training from it
    if parser_data.resume != "":
        checkpoint = torch.load(parser_data.resume)
        model.load_state_dict(checkpoint['model'])
        optimizer.load_state_dict(checkpoint['optimizer'])
        lr_scheduler.load_state_dict(checkpoint['lr_scheduler'])
        parser_data.start_epoch = checkpoint['epoch'] + 1
        print("the training process from epoch{}...".format(parser_data.start_epoch))

    train_loss = []
    learning_rate = []
    val_map = []

    val_data = None
    # If memory allows, load the validation data up front so that it is not
    # reloaded on every validation pass, which saves time
    # val_data = get_coco_api_from_dataset(val_data_loader.dataset)
    for epoch in range(20):
    for epoch in range(parser_data.start_epoch, parser_data.epochs):
        utils.train_one_epoch(model=model, optimizer=optimizer,
                              data_loader=train_data_loader,
                              device=device, epoch=epoch,
                              print_freq=50, train_loss=train_loss,
                              train_lr=learning_rate, warmup=True)
                              train_lr=learning_rate)

        lr_scheduler.step()
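
With `lr=0.0005`, `step_size=5` and `gamma=0.3`, a step-decay schedule multiplies the learning rate by 0.3 every 5 epochs. The sketch below works this out in plain Python using the closed form `base_lr * gamma ** (epoch // step_size)`; `step_lr` is a hypothetical helper, and the sketch ignores the warmup applied inside the first epoch.

```python
def step_lr(base_lr, epoch, step_size=5, gamma=0.3):
    """Learning rate in effect during `epoch` under a step-decay rule."""
    return base_lr * gamma ** (epoch // step_size)


# Values for the default 15-epoch run configured in this commit.
schedule = [step_lr(0.0005, epoch) for epoch in range(15)]
for epoch in (0, 5, 10):
    print(epoch, schedule[epoch])
```

Under these assumptions, epochs 0-4 train at 0.0005, epochs 5-9 at roughly 0.00015, and epochs 10-14 at roughly 0.000045.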

@@ -124,4 +134,30 @@ def main():


if __name__ == '__main__':
    main()
    import argparse

    parser = argparse.ArgumentParser(
        description=__doc__)

    # Device to train on
    parser.add_argument('--device', default='cuda:0', help='device')
    # Root directory of the training dataset
    parser.add_argument('--data-path', default='./', help='dataset')
    # Directory where weight files are saved
    parser.add_argument('--output-dir', default='./save_weights', help='path where to save')
    # To resume a previous run, give the path of the weight file it saved
    parser.add_argument('--resume', default='', type=str, help='resume from checkpoint')
    # Epoch number to resume training from
    parser.add_argument('--start_epoch', default=0, type=int, help='start epoch')
    # Total number of training epochs
    parser.add_argument('--epochs', default=15, type=int, metavar='N',
                        help='number of total epochs to run')

    args = parser.parse_args()
    print(args)

    # Create the weight output directory if it does not exist
    if not os.path.exists(args.output_dir):
        os.makedirs(args.output_dir)

    main(args)
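
To see how these options reach `main` as attributes, here is a small standalone sketch that rebuilds the same parser and parses an explicit argument list instead of `sys.argv`. The `build_parser` helper, the description string and the `/data/voc` path are assumptions made for illustration. Note that argparse turns the dash in `--data-path` into an underscore, which is why `main` reads `parser_data.data_path`.

```python
import argparse


def build_parser():
    """Rebuild the options added in this commit (illustrative copy)."""
    parser = argparse.ArgumentParser(description='SSD300 training options')
    parser.add_argument('--device', default='cuda:0', help='device')
    parser.add_argument('--data-path', default='./', help='dataset')
    parser.add_argument('--output-dir', default='./save_weights', help='path where to save')
    parser.add_argument('--resume', default='', type=str, help='resume from checkpoint')
    parser.add_argument('--start_epoch', default=0, type=int, help='start epoch')
    parser.add_argument('--epochs', default=15, type=int, metavar='N',
                        help='number of total epochs to run')
    return parser


# Passing a list keeps the sketch independent of the real command line.
args = build_parser().parse_args(['--data-path', '/data/voc', '--epochs', '30'])
print(args.data_path, args.epochs, args.device)
```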
Original file line number Diff line number Diff line change
@@ -21,7 +21,7 @@ def train_one_epoch(model, optimizer, data_loader, device, epoch, print_freq,

    lr_scheduler = None
    if epoch == 0 and warmup is True:  # enable warmup during the first epoch (epoch=0); think of it as a warm-up phase
        warmup_factor = 1.0 / 1000
        warmup_factor = 5.0 / 10000
        warmup_iters = min(1000, len(data_loader) - 1)

        lr_scheduler = warmup_lr_scheduler(optimizer, warmup_iters, warmup_factor)
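
The body of `warmup_lr_scheduler` is not part of this diff. Assuming it follows the linear ramp used in torchvision's detection reference scripts (an assumption, not something the commit confirms), the learning-rate multiplier would grow from `warmup_factor` to 1 over `warmup_iters` iterations. The sketch below computes that multiplier in plain Python; `warmup_multiplier` is a hypothetical name.

```python
def warmup_multiplier(step, warmup_iters, warmup_factor):
    """Linear ramp from warmup_factor (step 0) to 1.0 (step >= warmup_iters).

    Assumed form; the real warmup_lr_scheduler is not shown in this commit.
    """
    if step >= warmup_iters:
        return 1.0
    alpha = float(step) / warmup_iters
    return warmup_factor * (1 - alpha) + alpha


warmup_factor = 5.0 / 10000  # new value introduced by this commit
start = warmup_multiplier(0, 1000, warmup_factor)
middle = warmup_multiplier(500, 1000, warmup_factor)
end = warmup_multiplier(1000, 1000, warmup_factor)
print(start, middle, end)
```

Under this assumption, the change from `1.0 / 1000` to `5.0 / 10000` halves the multiplier at the very first iteration, so training starts from a lower learning rate.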
2 changes: 1 addition & 1 deletion pytorch_object_detection/ssd/transform.py
@@ -10,7 +10,7 @@ class Compose(object):
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, image, target):
    def __call__(self, image, target=None):
        for trans in self.transforms:
            image, target = trans(image, target)
        return image, target
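
Defaulting `target` to `None` lets the same pipeline run at inference time, where only an image is available, as predict_test.py does with `data_transform(original_img)`. The sketch below reproduces the `Compose` class from this hunk and pairs it with a made-up `Scale` transform (hypothetical, operating on a plain number rather than an image) to show both call styles.

```python
class Compose(object):
    """Chain transforms that each take and return an (image, target) pair."""

    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, image, target=None):
        for trans in self.transforms:
            image, target = trans(image, target)
        return image, target


class Scale(object):
    """Hypothetical transform: scales a numeric 'image', passes target through."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, image, target=None):
        return image * self.factor, target


pipeline = Compose([Scale(2), Scale(5)])
image_only, no_target = pipeline(3)           # inference: no target given
with_target = pipeline(3, {'labels': [1]})    # training: target passed along
print(image_only, no_target, with_target)
```

For this to work, every transform in the chain must itself tolerate `target=None`.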