[ACM MM2024 Oral] SelM: Selective Mechanism based Audio-Visual Segmentation


SelM [Paper]

SelM: Selective Mechanism based Audio-Visual Segmentation

Jiaxu Li, Songsong Yu, Yifan Wang*, Lijun Wang, Huchuan Lu

This repository contains code for "SelM: Selective Mechanism based Audio-Visual Segmentation" (ACM MM 2024 Oral, 3.97%).

IIAU Lab @ Dalian University of Technology

equal contribution

Overview

[Figure: Overview]

Environment Preparation

Our code was tested in a conda environment.

You can install conda from this link: Conda, and then create an environment as follows:

conda create -n selm python=3.9 

conda activate selm

We use PyTorch 2.0.1 with CUDA 11.7 as our default setting. Install PyTorch with pip as below:

pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2

Notice: mamba-ssm (Link) requires CUDA 11.6+, so you might have to update your CUDA.

For the other required packages:
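Before installing the remaining packages, it can help to confirm that the installed PyTorch build reports a CUDA version of at least 11.6. The snippet below is a minimal sketch, not part of this repository: the helper names `parse_version` and `cuda_is_new_enough` are made up for illustration, and the check only reads `torch.version.cuda`, which reports the CUDA version PyTorch was built with rather than the system toolkit.

```python
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a version string such as '11.7' into a tuple of ints, e.g. (11, 7)."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def cuda_is_new_enough(cuda_version: Optional[str], minimum: str = "11.6") -> bool:
    """Return True when cuda_version is at least `minimum`.

    None (the value of torch.version.cuda on CPU-only builds) counts as too old.
    """
    if cuda_version is None:
        return False
    return parse_version(cuda_version) >= parse_version(minimum)


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment.")
    else:
        print("PyTorch:", torch.__version__)
        print("CUDA build:", torch.version.cuda)
        print("GPU visible:", torch.cuda.is_available())
        print("CUDA >= 11.6:", cuda_is_new_enough(torch.version.cuda))
```

If the last line prints False, the mamba-ssm install is likely to fail, and updating CUDA (or installing a matching PyTorch build) is probably needed first.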

pip install -r requirements.txt

Dataset and Pretrained Backbone

For the AVSBench dataset, please refer to this link (AVSBench) to download the datasets.

For the pretrained backbones (ResNet50, PVT-v2, VGGish), please refer to this link to download them.

You can place the dataset in the "data" directory and the pretrained backbones in the "pretrained backbone" directory.

Notice: Don't forget to change the paths of the data and models in config.py.
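A quick existence check can catch a wrong path before a long training run starts. The snippet below is a minimal sketch under stated assumptions: `find_missing` is a hypothetical helper, and the two example locations are placeholders that should be replaced with the values actually set in config.py.

```python
from pathlib import Path
from typing import Dict, List


def find_missing(paths: Dict[str, str]) -> List[str]:
    """Return the labels whose path does not exist on disk."""
    return [label for label, location in paths.items() if not Path(location).exists()]


if __name__ == "__main__":
    # Hypothetical locations: replace them with the values set in config.py.
    expected = {
        "dataset root": "data",
        "pretrained backbones": "pretrained backbone",
    }
    missing = find_missing(expected)
    if missing:
        print("Missing paths:", ", ".join(missing))
    else:
        print("All configured paths exist.")
```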

Pretrained Model

You can download our pretrained SelM models from Google Drive and place them in the "pretrained model" directory.

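| Method | Subset | mIoU | F-score | Download |
|---|---|---|---|---|
| SelM-R50 | S4 | 76.6 | 86.2 | pth |
| SelM-PVTv2 | S4 | 83.5 | 91.2 | pth |
| SelM-R50 | MS3 | 54.5 | 65.6 | pth |
| SelM-PVTv2 | MS3 | 60.3 | 71.3 | pth |
| SelM-R50 | AVSS | 31.9 | 37.2 | pth |
| SelM-PVTv2 | AVSS | 41.3 | 46.9 | pth |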
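For readers unfamiliar with the two metrics in the table, the sketch below shows how mIoU and F-score can be computed for binary masks. It is an illustration only and not the evaluation code of this repository: the function names are made up, masks are assumed to be flattened sequences of 0/1 values, and β² = 0.3 is an assumption. The official evaluation may differ, for example in how predictions are thresholded and averaged, and the AVSS subset is multi-class rather than binary.

```python
from typing import Iterable, Sequence, Tuple

Mask = Sequence[int]  # flattened binary mask, values 0 or 1


def iou(pred: Mask, target: Mask) -> float:
    """Intersection over union of two flattened binary masks."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    # Two empty masks agree perfectly, so treat that case as IoU = 1.
    return intersection / union if union else 1.0


def mean_iou(pairs: Iterable[Tuple[Mask, Mask]]) -> float:
    """Average IoU over (prediction, ground truth) pairs."""
    scores = [iou(pred, target) for pred, target in pairs]
    return sum(scores) / len(scores)


def f_score(pred: Mask, target: Mask, beta_sq: float = 0.3) -> float:
    """Weighted harmonic mean of precision and recall.

    beta_sq = 0.3 is an assumption made for this sketch.
    """
    true_pos = sum(1 for p, t in zip(pred, target) if p and t)
    if true_pos == 0:
        return 0.0
    precision = true_pos / sum(1 for p in pred if p)
    recall = true_pos / sum(1 for t in target if t)
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


if __name__ == "__main__":
    prediction = [1, 1, 0, 0]
    ground_truth = [1, 0, 0, 0]
    print("IoU:", iou(prediction, ground_truth))
    print("F-score:", round(f_score(prediction, ground_truth), 4))
```

In the example, one of the two predicted pixels is correct and the single ground-truth pixel is covered, so precision is 0.5 and recall is 1.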

Train

For the S4 and MS3 settings, we provide single-GPU training. Run the commands below:

#S4
cd avs_s4
bash train.sh

#MS3
cd avs_ms3
bash train.sh

Note that for the AVSS setting, we provide multi-GPU training. To train SelM on 8 GPUs, run:

cd avss
bash train.sh

Test

For testing, remember to change the path of the weights, then run:

#S4
cd avs_s4
bash test.sh

#MS3
cd avs_ms3
bash test.sh

#AVSS
cd avss
bash test.sh

Acknowledgement

This repo is based on AVSBench, RIS-DMMI, and CGFormer. Many thanks to these wonderful works.

Citation

If you are interested in our work, you can cite it with the BibTeX entry below. Thank you!

@inproceedings{li2024selm,
  title={SelM: Selective Mechanism based Audio-Visual Segmentation},
  author={Li, Jiaxu and Yu, Songsong and Wang, Yifan and Wang, Lijun and Lu, Huchuan},
  booktitle={Proceedings of the 32nd ACM International Conference on Multimedia},
  pages={3926--3935},
  year={2024}
}
