Run seq2jpg.py with the data folder as input; the output is written to the JPEG folder. The extracted images will be
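Run vbb2voc.py with the annotations folder as input; the output is written to the xmlresult folder.
Run mergeimg.py and mergexml.py.
Name the JPEG image files and their corresponding XML files with a 6-digit numeric index of the form "xxxxxx".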
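
The naming rule above can be sketched as follows. This is only an illustration, not the code in mergeimg.py or mergexml.py: the function name `renumber_pairs`, its parameters, the `.jpg` extension and the choice to copy into separate output folders are all assumptions.

```python
import shutil
from pathlib import Path


def renumber_pairs(img_dir, xml_dir, out_img_dir, out_xml_dir, start=0):
    """Copy matching image/XML pairs under 6-digit names.

    Hypothetical helper: returns a dict mapping old stem -> new stem.
    """
    img_dir, xml_dir = Path(img_dir), Path(xml_dir)
    out_img_dir, out_xml_dir = Path(out_img_dir), Path(out_xml_dir)
    out_img_dir.mkdir(parents=True, exist_ok=True)
    out_xml_dir.mkdir(parents=True, exist_ok=True)

    mapping = {}
    index = start
    # Sort so the numbering is reproducible between runs.
    for img_path in sorted(img_dir.glob("*.jpg")):
        xml_path = xml_dir / (img_path.stem + ".xml")
        if not xml_path.exists():
            continue  # skip images that have no annotation file
        new_stem = f"{index:06d}"  # e.g. 7 -> "000007"
        shutil.copy2(img_path, out_img_dir / (new_stem + ".jpg"))
        shutil.copy2(xml_path, out_xml_dir / (new_stem + ".xml"))
        mapping[img_path.stem] = new_stem
        index += 1
    return mapping
```

Copying into new folders, rather than renaming in place, avoids overwriting a file whose old name already looks like a 6-digit index. Note that this sketch does not touch the `<filename>` field inside each XML file.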
##5. Generate 4 txt files that specify the training set, validation set, test set, and training+validation set
Run generateTXT.py with the xmlresult folder as input; the output is written to the ImageSets/Main folder.
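
For reference, a split of this kind could look roughly like the sketch below. It is not the actual generateTXT.py: the function name `write_split_files`, the 80/20 ratios and the fixed random seed are assumptions chosen for illustration. The file names train.txt, val.txt, trainval.txt and test.txt follow the usual VOC layout.

```python
import random
from pathlib import Path


def write_split_files(xml_dir, out_dir, trainval_ratio=0.8, train_ratio=0.8, seed=0):
    """Write train/val/trainval/test id lists (hypothetical helper).

    trainval_ratio: share of all samples that goes to trainval (assumed value).
    train_ratio: share of trainval that goes to train (assumed value).
    """
    stems = sorted(p.stem for p in Path(xml_dir).glob("*.xml"))
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(stems)

    n_trainval = int(len(stems) * trainval_ratio)
    n_train = int(n_trainval * train_ratio)
    splits = {
        "train": stems[:n_train],
        "val": stems[n_train:n_trainval],
        "trainval": stems[:n_trainval],
        "test": stems[n_trainval:],
    }

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, ids in splits.items():
        # One image id per line, without a file extension.
        (out_dir / (name + ".txt")).write_text("".join(s + "\n" for s in sorted(ids)))
    return splits
```

A call such as `write_split_files("xmlresult", "ImageSets/Main")` would then produce the four files in one pass.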
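The Caltech annotations contain several pedestrian-related labels besides person, such as people. findPeople.py replaces the people label with person. It is a helper script and is not required.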
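
The label replacement described above can be sketched with the standard library alone. This is a minimal illustration, not the contents of findPeople.py; the function name `relabel_people` is hypothetical, and it assumes the VOC-style layout in which each `<object>` element has a `<name>` child.

```python
import xml.etree.ElementTree as ET


def relabel_people(xml_path, old="people", new="person"):
    """Rename object labels in one VOC-style XML file; return the number changed."""
    tree = ET.parse(str(xml_path))
    changed = 0
    for obj in tree.getroot().findall("object"):
        name = obj.find("name")
        if name is not None and name.text == old:
            name.text = new
            changed += 1
    if changed:
        # Rewrite the file only when something was actually replaced.
        tree.write(str(xml_path), encoding="utf-8")
    return changed
```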
IMPORTANT NOTES: Some of the vbb files contain errors. Specifically, some bounding boxes extend outside the image (for example, xmax is greater than the image width), which can cause serious problems when training Faster R-CNN. For now, you can manually check for the offending XML files during training. I will try to correct the bad bounding boxes automatically later.
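
Until an automatic fix is available, one possible way to handle out-of-range boxes is to clamp them to the image size, as in the sketch below. This is not part of the repository: the function name `clip_boxes` is hypothetical, and it assumes each XML file has a `<size>` element with `<width>` and `<height>` and that coordinates are 1-based (valid range 1 to width, 1 to height), as in the usual VOC convention.

```python
import xml.etree.ElementTree as ET


def clip_boxes(xml_path):
    """Clamp every bndbox to the image size; return how many boxes were changed."""
    tree = ET.parse(str(xml_path))
    root = tree.getroot()
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    # Upper limit for each coordinate; the lower limit is 1 (assumed 1-based).
    limits = {"xmin": width, "xmax": width, "ymin": height, "ymax": height}

    changed = 0
    for box in root.iter("bndbox"):
        touched = False
        for tag, upper in limits.items():
            node = box.find(tag)
            value = int(float(node.text))
            clipped = min(max(value, 1), upper)
            if clipped != value:
                node.text = str(clipped)
                touched = True
        changed += touched
    if changed:
        tree.write(str(xml_path), encoding="utf-8")
    return changed
```

Clamping can leave a degenerate box (for example xmin equal to xmax) when the original box lies entirely outside the image, so files reported as changed are still worth a manual look.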