Yanan Luo*, Jinhui Yi*, Yazan Abu Farha, Moritz Wolter, Juergen Gall
*Equal Contribution
If you like our project, please give us a star ✨ on GitHub for the latest updates.
This is the official implementation of the paper "Rethinking temporal self-similarity for repetitive action counting".
- 2025-03-10: The `*_feature_npz` folders are released. The script for testing directly on videos is released.
- 2024-11-14: The oral presentation video has been released on the YouTube channel. [video]
- 2024-09-25: The code and pre-trained model are available. [pretrained]
- 2024-08-09: This paper has been accepted by the WICV workshop 2024 in ECCV 2024 as an extended abstract.
- 2024-07-12: The preprint of the paper is available. [paper]
- 2024-06-07: This paper has been accepted by ICIP 2024 as an Oral.
We rethink how a temporal self-similarity matrix (TSM) can be utilized for counting repetitive actions and propose a framework (RACnet) that learns embeddings and predicts action start probabilities at full temporal resolution. The number of repeated actions is then inferred from the action start probabilities. We propose a novel loss based on a generated reference TSM, which enforces that the self-similarity of the learned frame-wise embeddings is consistent with the self-similarity of repeated actions.
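The count is then obtained from the predicted frame-wise action start probabilities. As a rough, illustrative sketch (not necessarily the exact decoding used in RACnet; the function name, `threshold`, and `min_distance` are hypothetical), one could count thresholded local maxima of the start probabilities:

```python
# Illustrative sketch only: count repetitions as thresholded local maxima of the
# per-frame action start probabilities. The actual decoding in RACnet may differ.
import numpy as np

def count_from_start_probs(start_probs: np.ndarray,
                           threshold: float = 0.5,
                           min_distance: int = 4) -> int:
    """start_probs: (T,) array of per-frame action start probabilities."""
    count, last_start = 0, -min_distance
    for t in range(1, len(start_probs) - 1):
        # local maximum of the probability curve
        is_peak = start_probs[t] >= start_probs[t - 1] and start_probs[t] > start_probs[t + 1]
        if is_peak and start_probs[t] >= threshold and t - last_start >= min_distance:
            count += 1
            last_start = t
    return count
```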
| Datasets  | MAE ⬇️ | OBO ⬆️ |
|-----------|--------|--------|
| RepCountA | 0.4441 | 0.3933 |
| UCFRep    | 0.5260 | 0.3714 |
| Countix   | 0.5278 | 0.3924 |
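MAE ⬇️ is the mean absolute counting error normalized by the ground-truth count, and OBO ⬆️ is the fraction of videos whose predicted count is within ±1 of the ground truth (the definitions commonly used in repetition counting). A minimal sketch of how these metrics can be computed from predicted and ground-truth counts:

```python
import numpy as np

def mae_obo(pred_counts, gt_counts):
    """Mean absolute error normalized by the GT count, and off-by-one accuracy."""
    pred = np.asarray(pred_counts, dtype=float)
    gt = np.asarray(gt_counts, dtype=float)
    mae = float(np.mean(np.abs(pred - gt) / np.maximum(gt, 1e-8)))  # normalized counting error
    obo = float(np.mean(np.abs(pred - gt) <= 1))                    # within +/- 1 repetition
    return mae, obo
```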
We provide the `*_feature_npz` folders (without annotation files); please check RACnet_feature_npy. In this case, you can skip directly to step 5 and start training and testing :)
- Download the pretrained backbone model: Video Swin Transformer tiny (github).
- Feature extraction: use the backbone model to extract a 7 × 7 × 768 feature map per frame and flatten it (see the sketch below).
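A hedged sketch of this step, assuming the backbone returns a (T, 7, 7, 768) feature map for T input frames; `backbone` and `extract_per_frame_features` are placeholder names, not part of this repository:

```python
# Hedged sketch: per-frame feature extraction and flattening.
# `backbone` stands in for the Video Swin Transformer tiny feature extractor and
# is assumed to return a (T, 7, 7, 768) feature map for T preprocessed frames.
import numpy as np
import torch

@torch.no_grad()
def extract_per_frame_features(backbone, frames: torch.Tensor) -> np.ndarray:
    """frames: (T, 3, H, W) preprocessed frames of one video."""
    feat = backbone(frames)                 # assumed shape (T, 7, 7, 768)
    flat = feat.reshape(feat.shape[0], -1)  # flatten to (T, 7*7*768)
    return flat.cpu().numpy()
```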
- Generate the reference TSM.

```bash
cd dataset
python gen_refTSM.py
```
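The exact construction of the reference TSM is implemented in `gen_refTSM.py`. Purely as an illustration of the idea (an assumption, not the repository's implementation), a reference TSM can mark frame pairs that belong to the same annotated repetition as similar:

```python
# Illustrative assumption only: a block-structured reference TSM built from
# annotated (start, end) repetition segments; gen_refTSM.py may differ.
import numpy as np

def toy_reference_tsm(num_frames: int, segments: list[tuple[int, int]]) -> np.ndarray:
    tsm = np.zeros((num_frames, num_frames), dtype=np.float32)
    for start, end in segments:      # one similarity block per repetition cycle
        tsm[start:end, start:end] = 1.0
    np.fill_diagonal(tsm, 1.0)       # every frame is similar to itself
    return tsm
```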
- Save the following arrays into one `.npz` file per video.
```python
# The arrays to be included in the file:
# per_frame_features: from step 2
# refTSM: from step 3
# frame_length: see metadata/RepCountA_frame_length.csv
# frame_name: same as above
# count: ground-truth repetition count (UCFRep and Countix only)
import numpy as np

# RepCountA
np.savez(file='video_name.npz', img_feature=per_frame_features, gt_tsm=refTSM, length=frame_length)

# UCFRep and Countix, inference only
np.savez(file='video_name.npz', img_feature=per_frame_features, length=frame_length, count=count)
```
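To sanity-check a saved file, you can load it back and inspect the stored keys ('video_name.npz' is just a placeholder name):

```python
import numpy as np

# Load one saved feature file and inspect the stored arrays.
data = np.load('video_name.npz')
print(data.files)                 # e.g. ['img_feature', 'gt_tsm', 'length'] for RepCountA
print(data['img_feature'].shape)  # (T, 7*7*768) flattened per-frame features
```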
- Data structure.
```
# RepCountA dataset
# .csv files are the annotations from the original dataset
RepCountA_feature_npz/
├── train.csv
├── valid.csv
├── test.csv
├── train/
│   ├── video_1.npz
│   ├── video_2.npz
│   └── ...
├── valid/
│   ├── video_4.npz
│   ├── video_5.npz
│   └── ...
└── test/
    ├── video_7.npz
    ├── video_8.npz
    └── ...

# UCFRep and Countix dataset
*_feature_npz/
├── test.csv
└── test/
    ├── video_7.npz
    ├── video_8.npz
    └── ...
```
Note:
- We will upload the `*_feature_npz` folders soon.
- Countix is a subset of Kinetics. Since some videos are no longer available at test time, we provide the features of the available videos in `Countix_feature_npz`.
We recommend using a conda virtual environment.
```bash
conda create -n racnet python=3.11.5 -y
conda activate racnet
pip install -r requirements.txt  # you may need to change the cuda version based on your machine
```
Please refer to the configs for training and testing separately.
```bash
python train.py configs/train_RACnet.py
python test.py configs/test_RACnet.py
```
For inference directly on videos, see `test4video.py`.
```bibtex
@inproceedings{luo2024rethinking,
  title={Rethinking temporal self-similarity for repetitive action counting},
  author={Luo, Yanan and Yi, Jinhui and Farha, Yazan Abu and Wolter, Moritz and Gall, Juergen},
  booktitle={2024 IEEE International Conference on Image Processing (ICIP)},
  pages={2187--2193},
  year={2024},
  organization={IEEE}
}
```
This project is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License. You can view the full license here.