[AAAI 2025] Motion Prior Knowledge Learning with Homogeneous Language Descriptions for Moving Infrared Small Target Detection

UESTC-nnLab/MoPKL

Motion Prior Knowledge Learning with Homogeneous Language Descriptions for Moving Infrared Small Target Detection

The 39th Annual AAAI Conference on Artificial Intelligence (AAAI 2025)

Datasets (bounding box-based)

  • Datasets are available at ITSDT-15K and IRDST (code: cctd). Alternatively, you can download IRDST directly from the website.

  • Training reads annotations from .txt files, so organize each dataset in the same format as the provided coco_train_ITSDT.txt and coco_val_ITSDT.txt. We provide these .txt files for both ITSDT-15K and IRDST. Then set the annotation paths, for example:

train_annotation_path = '/home/ITSDT-15K/coco_train_ITSDT.txt'
val_annotation_path = '/home/ITSDT-15K/coco_val_ITSDT.txt'
  • Alternatively, generate new .txt files that match your dataset paths. A .txt file (e.g., coco_train_ITSDT.txt) can be generated from the corresponding .json file (e.g., instances_train2017.json) with the script below. We also provide all .json files for ITSDT-15K and IRDST (code: cctd).
python utils_coco/coco_to_txt.py
  • The folder structure should look like this:
ITSDT-15K
├─instances_train2017.json
├─instances_test2017.json
├─coco_train_ITSDT.txt
├─coco_val_ITSDT.txt
├─images
│   ├─1
│   │   ├─0.bmp
│   │   ├─1.bmp
│   │   ├─2.bmp
│   │   ├─ ...
│   ├─2
│   │   ├─0.bmp
│   │   ├─1.bmp
│   │   ├─2.bmp
│   │   ├─ ...
│   ├─3
│   │   ├─ ...
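Before training, it can help to confirm that the dataset directory matches the layout above. The following is a minimal sketch and is not part of this repository: the function name `missing_dataset_entries` is hypothetical, and the expected entries are simply copied from the tree shown above (adjust them for IRDST).

```python
from pathlib import Path

# Entries copied from the folder tree above; adjust the names for IRDST.
EXPECTED_ENTRIES = (
    "instances_train2017.json",
    "instances_test2017.json",
    "coco_train_ITSDT.txt",
    "coco_val_ITSDT.txt",
    "images",
)


def missing_dataset_entries(root, expected=EXPECTED_ENTRIES):
    """Return the expected files/folders that are absent under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    # Hypothetical location; replace with your own dataset path.
    missing = missing_dataset_entries("/home/ITSDT-15K")
    print("Missing entries:", missing or "none")
```

An empty list means every top-level entry is present; the check does not inspect the contents of the files or the per-sequence image folders.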

Prerequisite

  • python==3.11.8
  • pytorch==2.1.1
  • torchvision==0.16.1
  • numpy==1.26.4
  • opencv-python==4.9.0.80
  • scipy==1.13
  • Tested on Ubuntu 20.04, with CUDA 11.8, and 1x NVIDIA 3090.
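To compare an existing environment against the versions listed above, a small check along the following lines may be useful. It is a sketch under stated assumptions rather than a script shipped with this repository: `installed_version` and `find_mismatches` are hypothetical helper names, the package names are the PyPI distribution names (PyTorch is published as `torch`), and the comparison is a simple prefix match so that builds such as `2.1.1+cu118` are accepted.

```python
from importlib import metadata

# Pins copied from the list above, keyed by PyPI distribution name.
PINNED = {
    "torch": "2.1.1",
    "torchvision": "0.16.1",
    "numpy": "1.26.4",
    "opencv-python": "4.9.0.80",
    "scipy": "1.13",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned=PINNED, lookup=installed_version):
    """Map each package that is missing or off-pin to the version found."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        # Prefix match: "2.1.1+cu118" satisfies the pin "2.1.1".
        if found is None or not found.startswith(wanted):
            mismatches[name] = found
    return mismatches


if __name__ == "__main__":
    for name, found in find_mismatches().items():
        print(f"{name}: expected {PINNED[name]}, found {found}")
```

No output means every pinned package was found with a matching version prefix.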

Usage of MoPKL

Language Descriptions

  • We provide encoded language description embeddings (code: xbet) for the ITSDT-15K and IRDST datasets. The download contains two embedding files: emb_train_ITSDT.pkl and emb_train_IRDST.pkl.

  • We also provide the initial language description text files (code: bn38), which you can explore further with vision-language models.

  • Taking the ITSDT-15K dataset as an example, modify the path to the language description embeddings in dataloader_for_ITSDT:

# Path to your emb_train_ITSDT.pkl

description = pickle.load(open('/home/MoPKL/emb_train_ITSDT.pkl', 'rb'))
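The internal layout of the embedding file is not described here, so it may be worth inspecting it once before wiring it into the dataloader. The sketch below makes no assumption about that layout: it only reports the top-level type, its length, and (for a dictionary) a few keys. The helper name `summarize_pickle` is hypothetical.

```python
import pickle


def summarize_pickle(path):
    """Load a pickle file and describe its top-level object."""
    # Only unpickle files from sources you trust: pickle can run code on load.
    with open(path, "rb") as f:
        obj = pickle.load(f)
    summary = {"type": type(obj).__name__}
    if hasattr(obj, "__len__"):
        summary["length"] = len(obj)
    if isinstance(obj, dict):
        # A few keys are usually enough to see how entries are indexed.
        summary["sample_keys"] = list(obj)[:3]
    return summary


if __name__ == "__main__":
    # Path taken from the example above; replace with your own location.
    print(summarize_pickle("/home/MoPKL/emb_train_ITSDT.pkl"))
```

If the file stores framework-specific objects such as tensors, the corresponding library must be importable when the file is loaded.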

Train

  • Note: Use the dataloader that matches your dataset. For example, to train the model on the ITSDT-15K dataset, run the following command:
CUDA_VISIBLE_DEVICES=0 python train_ITSDT.py 

Test

  • model_best.pth is not necessarily the best model. A checkpoint with a lower val_loss or a higher AP50 during validation may perform better, so set the checkpoint path accordingly:
"model_path": '/home/MoPKL/logs/model.pth'
  • Change the path of the test set's .json file and the dataset image path. For example:
# Use ITSDT-15K dataset for test

cocoGt_path         = '/home/public/ITSDT-15K/instances_test2017.json'
dataset_img_path    = '/home/public/ITSDT-15K/'
python test.py

Visualization

  • We support video and single-frame image prediction.
# mode = "video" (predict a sequence)

mode = "predict"  # Predict a single-frame image 
python predict.py

Results

  • For bounding box detection, we use COCO's evaluation metrics:
Method | Dataset   | mAP50 (%) | Precision (%) | Recall (%) | F1 (%) | Download
MoPKL  | ITSDT-15K | 79.78     | 93.29         | 86.80      | 89.92  | Baidu (code: pchd)
MoPKL  | IRDST     | 74.54     | 89.04         | 84.74      | 86.84  |
  • PR curves on the ITSDT-15K and IRDST datasets are reported in the paper.
  • We provide the results (code: 4ves) on ITSDT-15K and IRDST, and you can plot them using Python and matplotlib.
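The F1 values reported above can be sanity-checked against the precision and recall columns. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall; the README does not state how the repository computes it, so treat this as a consistency check rather than the authors' evaluation code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F1 %) copied from the results above.
reported = {
    "ITSDT-15K": (93.29, 86.80, 89.92),
    "IRDST": (89.04, 84.74, 86.84),
}

for dataset, (p, r, f1_reported) in reported.items():
    print(f"{dataset}: computed F1 = {f1_score(p, r):.2f}, reported F1 = {f1_reported:.2f}")
```

Under this definition the computed values agree with the reported ones to within about 0.01, a gap that rounding of the precision and recall inputs would explain.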

Contact

If you have any questions, please contact Shengjia Chen by e-mail: [email protected].

References

  1. S. Chen, L. Ji, J. Zhu, M. Ye and X. Yao, "SSTNet: Sliced Spatio-Temporal Network With Cross-Slice ConvLSTM for Moving Infrared Dim-Small Target Detection," in IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1-12, 2024, Art no. 5000912, doi: 10.1109/TGRS.2024.3350024.
  2. Ruigang Fu, Hongqi Fan, Yongfeng Zhu, et al. A dataset for infrared time-sensitive target detection and tracking for air-ground application[DS/OL]. V2. Science Data Bank, 2022[2024-12-10]. https://cstr.cn/31253.11.sciencedb.j00001.00331. CSTR:31253.11.sciencedb.j00001.00331.

Citation

If you find this repo useful, please cite our paper.
