SelM [Paper]

SelM: Selective Mechanism based Audio-Visual Segmentation

**Jiaxu Li†, Songsong Yu†, Yifan Wang*, Lijun Wang, Huchuan Lu**

This repository contains code for "SelM: Selective Mechanism based Audio-Visual Segmentation" (ACM MM 2024 Oral, 3.97%).

IIAU Lab @ Dalian University of Technology

† Equal contribution

Overview

Environment Preparation

Our code was tested in a conda environment.

You can install conda from the Conda link and then create an environment as follows:

conda create -n selm python=3.9
conda activate selm

We use PyTorch 2.0.1 with CUDA 11.7 as our default setting. Install PyTorch with pip as below:

pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2

Note: Mamba-ssm (Link) requires CUDA 11.6+, so you might have to update your CUDA installation.

For the other required packages:
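The CUDA requirement above can be checked before installing the remaining packages. The following is a minimal sketch, not part of the repository: the helper names `parse_cuda_version`, `meets_minimum`, and `installed_torch_cuda_version` are made up for illustration. It reads `torch.version.cuda`, which reports the CUDA version PyTorch was built against; the system toolkit used to compile extensions may differ, so treat the result as a first sanity check only.

```python
def parse_cuda_version(version: str) -> tuple:
    """Turn a version string such as '11.7' into the tuple (11, 7)."""
    parts = version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major, minor


def meets_minimum(version: str, minimum: tuple = (11, 6)) -> bool:
    """True if `version` is at least `minimum` (11.6 by default, per the note above)."""
    return parse_cuda_version(version) >= minimum


def installed_torch_cuda_version():
    """Return the CUDA version PyTorch was built with, or None if unavailable."""
    try:
        import torch  # imported lazily so the helpers above work without PyTorch
    except ImportError:
        return None
    return torch.version.cuda  # None for CPU-only builds


if __name__ == "__main__":
    found = installed_torch_cuda_version()
    if found is None:
        print("No CUDA-enabled PyTorch build was found.")
    elif meets_minimum(found):
        print(f"PyTorch reports CUDA {found}, which satisfies the 11.6+ requirement.")
    else:
        print(f"PyTorch reports CUDA {found}, which is older than 11.6.")
```

Run it inside the activated `selm` environment after installing PyTorch.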

pip install -r requirements.txt

Dataset and Pretrained Backbone

For the AVSBench dataset, please refer to the AVSBench link to download the datasets.

For the pretrained backbones (ResNet50, PVT-v2, VGGish), please refer to this link to download them.

Place the dataset in the directory data and the pretrained backbones in the directory pretrained backbone. Note: don't forget to change the data and model paths in config.py.
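Before training, it may help to confirm that the expected directories are in place. The sketch below is an illustration only: the `EXPECTED_PATHS` mapping and the `find_missing` helper are hypothetical and do not mirror the variables in config.py. Only the two directory names come from this README.

```python
from pathlib import Path

# Hypothetical labels; the directory names are the ones mentioned in this README.
EXPECTED_PATHS = {
    "dataset": "data",
    "backbones": "pretrained backbone",
}


def find_missing(paths: dict, root: str = ".") -> list:
    """Return the labels whose path does not exist under `root`."""
    base = Path(root)
    return [label for label, rel in paths.items() if not (base / rel).exists()]


if __name__ == "__main__":
    missing = find_missing(EXPECTED_PATHS)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected directories were found.")
```

Run it from the repository root, or pass the repository path as `root`.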

Pretrained Model

You can download our pretrained SelM models from Google Drive and place them in the directory pretrained model.

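| Method | Subset | mIoU | F-score | Download |
| --- | --- | --- | --- | --- |
| SelM-R50 | S4 | 76.6 | 86.2 | pth |
| SelM-PVTv2 | S4 | 83.5 | 91.2 | pth |
| SelM-R50 | MS3 | 54.5 | 65.6 | pth |
| SelM-PVTv2 | MS3 | 60.3 | 71.3 | pth |
| SelM-R50 | AVSS | 31.9 | 37.2 | pth |
| SelM-PVTv2 | AVSS | 41.3 | 46.9 | pth |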
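For readers unfamiliar with the mIoU column, here is a minimal sketch of intersection-over-union for binary masks, averaged over several mask pairs. It is a generic illustration in plain Python, not the repository's evaluation code. The function names are hypothetical, and scoring two empty masks as 1.0 is an assumption of this sketch. The table values appear to be percentages, while this sketch returns fractions in [0, 1].

```python
def binary_iou(pred, target):
    """IoU between two same-sized binary masks given as nested lists of 0/1."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumption: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


def mean_iou(pairs):
    """Average IoU over an iterable of (pred, target) mask pairs."""
    scores = [binary_iou(pred, target) for pred, target in pairs]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    pred = [[1, 1], [0, 0]]
    target = [[1, 0], [0, 0]]
    print(binary_iou(pred, target))  # 1 overlapping pixel out of 2 in the union
```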

Train

For the S4 and MS3 settings, we provide single-GPU training. Run the commands below:

# S4
cd avs_s4
bash train.sh

# MS3
cd avs_ms3
bash train.sh

For the AVSS setting, we provide multi-GPU training. To train SelM on 8 GPUs, run:

cd avss
bash train.sh

Test

For testing, remember to change the path of the weights, then run:

# S4
cd avs_s4
bash test.sh

# MS3
cd avs_ms3
bash test.sh

# AVSS
cd avss
bash test.sh

Acknowledgement

This repo is based on AVSBench, RIS-DMMI, and CGFormer. Many thanks to these wonderful works.

Citation

If you are interested in our work, you can cite it with the BibTeX entry below. Thank you!

@inproceedings{li2024selm,
  title={SelM: Selective Mechanism based Audio-Visual Segmentation},
  author={Li, Jiaxu and Yu, Songsong and Wang, Yifan and Wang, Lijun and Lu, Huchuan},
  booktitle={Proceedings of the 32nd ACM International Conference on Multimedia},
  pages={3926--3935},
  year={2024}
}
