Template-based object detection.
This code is tested on Ubuntu 20.04 with Python 3.8.5 and PyTorch 1.10.1. We recommend using a virtual environment.
python3 -m venv env
source env/bin/activate
To install the required packages, run the following command:
pip install -r requirements.txt
You also need to install the Deformable Transformer CUDA packages. To do so, navigate to "models/ops" and run the following commands:
To use the network you can either use the pretrained weights we provide or train the network yourself.
The pretrained model weights may be downloaded from the following link:
To train the network, you first need to set up your dataset. The dataset annotation file is a modified version of a COCO dataset. The dataset should have the following folder structure:
├── images/
| ├── img_1.{ext}
| ├── img_2.{ext}
├── depth/
| ├── img_1.{ext}
| ├── img_2.{ext}
├── targets/
| ├── {tgt_class_1}
| | ├── tgt_img_1.{ext}
| ├── {tgt_class_2}
| | ├── tgt_img_1.{ext}
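Before training, it can help to confirm that a dataset folder matches the layout above. The following is a minimal sketch and not part of this repository: the function name check_dataset_layout is hypothetical, and it only checks that the three folders exist and that each target class folder holds at least one file.

```python
from pathlib import Path

# Sub-folders named in the layout above.
EXPECTED_DIRS = ("images", "depth", "targets")


def check_dataset_layout(root):
    """Return a list of human-readable problems found under `root`.

    Hypothetical helper: it checks folder presence only, not image contents.
    """
    root = Path(root)
    problems = []

    for name in EXPECTED_DIRS:
        if not (root / name).is_dir():
            problems.append(f"missing folder: {name}/")

    targets = root / "targets"
    if targets.is_dir():
        class_dirs = sorted(p for p in targets.iterdir() if p.is_dir())
        if not class_dirs:
            problems.append("targets/ contains no class folders")
        for class_dir in class_dirs:
            # Each class folder is expected to hold at least one template image.
            if not any(p.is_file() for p in class_dir.iterdir()):
                problems.append(f"no template images in targets/{class_dir.name}/")

    return problems
```

An empty list means the folder skeleton looks as expected. Whether the image files themselves are readable is not checked.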
The annotation file (Annotations.json) is a modified version of the COCO annotation format. It has the following structure (file names must include the extension):
"images": [ {"id": int, "file_name": str, "depth_name": str, "width": int, "height": int}, ... ],
"annotations": [{"id": int, "image_id": int, "category_id": int, "bbox": [x,y,w,h], "area": int, "iscrowd": float, "sup_id": int}, ... ],
"categories": [{"id": int, "name": str, "supercategory": str, "sup_id": int}, ... ],
Next, set up your datasets in the config file (configs/visual_focusnet_config.py) by setting the "TRAIN_DATASETS" and "VAL_DATASETS" variables. Each variable should contain the paths to your annotation.json files.
In the config file you can also set other network parameters.
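As a rough illustration, the dataset entries in the config file might look like the sketch below. This assumes that both variables are plain Python lists of path strings, which the actual config file may or may not use. All paths shown are hypothetical placeholders.

```python
# Sketch of the dataset entries in configs/visual_focusnet_config.py.
# Assumption: each variable is a list of paths to annotation files.
# The paths below are placeholders; replace them with your own.
TRAIN_DATASETS = [
    "/data/my_dataset_a/annotations.json",
    "/data/my_dataset_b/annotations.json",
]

VAL_DATASETS = [
    "/data/my_dataset_val/annotations.json",
]
```

Check the existing entries in the config file for the exact form expected before editing.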
To train the network run the following command:
python train_detr.py --save_dir <path_to_save_dir> --load_dir <OPTIONAL: path_to_load_dir>
To test the network on your data, use the "test_detr.py" script. First, prepare your data as shown in the "example/data" folder. Then run the test:
python test_detr.py --load_dir <path_to_load_dir> --save_dir <path_to_save_dir> --data_dir <path_to_data_dir>
If you use this code for your research, please cite my thesis:
**Suction Grasp Generation and Template-Based object detection**
