This repository contains the official implementation of Scene-aware Probabilistic Masking and Fusion for Video Anomaly Detection.
- Scene-aware Probabilistic Masking and Fusion for Video Anomaly Detection is under peer review.
To enhance visual feature extraction, we propose the Surveillance Video Masked Autoencoder (SVMAE) framework. It uses scene-aware probabilistic masking and a perspective reconstruction loss for efficient pre-training. In addition, we propose a dual-encoder architecture with scene-aware token fusion for video anomaly detection.
The framework is depicted below:
- You can download the extracted features from here.
- For the UCF-Crime dataset, put the generated/downloaded features under the `./save/Crime` folder. Other datasets follow the same structure.
- For the UCF-Crime dataset, change the path of the visual features in `./list/ucf-videoMae-CLIP-L_UCF_9-5_9-1_finetune_dif_0.5_SP_norm_a0.05_fast.list` and `list/ucf-videoMae-test-CLIP-L_UCF_9-5_9-1_finetune_dif_0.5_SP_norm_a0.05_fast.list`. Other datasets follow the same structure.
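Editing the list files by hand is error-prone when they contain many entries. The sketch below shows one possible way to rewrite the path prefix in a list file. It assumes that each line of a `.list` file is a path to a feature file; the helper name `rewrite_prefix` and any prefixes you pass to it are hypothetical and not part of this repository.

```python
from pathlib import Path


def rewrite_prefix(list_file, old_prefix, new_prefix):
    """Replace old_prefix with new_prefix at the start of each line.

    Assumption: every line of list_file is a path to a feature file.
    Lines that do not start with old_prefix are left unchanged.
    Returns the number of lines that were rewritten.
    """
    path = Path(list_file)
    lines = path.read_text().splitlines()
    changed = 0
    out = []
    for line in lines:
        if line.startswith(old_prefix):
            line = new_prefix + line[len(old_prefix):]
            changed += 1
        out.append(line)
    # Write the file back with one entry per line.
    path.write_text("".join(entry + "\n" for entry in out))
    return changed
```

For example, `rewrite_prefix(list_path, "/old/root/", "./save/Crime/")` would redirect every entry that starts with `/old/root/`, where `list_path` is one of the list files above and both prefixes are placeholders. Back up the list file first, since the helper overwrites it in place.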
Run `pip install -r requirement.txt` to install the requirements.
!!!VERY IMPORTANT!!!
After installing the requirements, open a separate terminal and run `visdom` before running the commands below.
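Before launching training, it can help to confirm that the visdom server is actually reachable. The sketch below is a generic TCP check, not part of this repository. It assumes visdom is running on `localhost` at its default port 8097; adjust `host` and `port` if you started it differently. The helper name `server_is_up` is hypothetical.

```python
import socket


def server_is_up(host="localhost", port=8097, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Assumption: visdom listens on its default port 8097.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

If `server_is_up()` returns `False`, start `visdom` in another terminal before running the training or testing scripts.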
The meaning of each argument is documented in `option.py`. To train the best model presented in the paper, use the following settings:
UCF-Crime dataset
- Training: `bash run.sh`
- Testing only: `bash run_test.sh`
More are coming soon!
This code is based on VideoMAE, TEVAD and UR-DMU. We thank the authors for their great work.