# Meta-Transformer for Audio Understanding

This part of the code is for audio data understanding with Meta-Transformer. We conduct experiments on audio understanding based on AST (Audio Spectrogram Transformer). Thanks to the authors for their outstanding project.

## Citation

If the code and paper are helpful for your research, please kindly cite:

```bibtex
@article{zhang2023metatransformer,
  title={Meta-Transformer: A Unified Framework for Multimodal Learning},
  author={Zhang, Yiyuan and Gong, Kaixiong and Zhang, Kaipeng and Li, Hongsheng and Qiao, Yu and Ouyang, Wanli and Yue, Xiangyu},
  journal={arXiv preprint arXiv:2307.10802},
  year={2023},
}

@inproceedings{gong21b_interspeech,
  author={Yuan Gong and Yu-An Chung and James Glass},
  title={{AST: Audio Spectrogram Transformer}},
  booktitle={Proc. Interspeech 2021},
  pages={571--575},
  doi={10.21437/Interspeech.2021-698},
  year={2021},
}
```

## Usage

### 1. Environment Setup

The environment is compatible with that of our point cloud understanding experiments; please refer to the corresponding doc, then install the dependencies:

```bash
pip install -r requirements.txt
```
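As a quick sanity check (a sketch, not part of the repo; it assumes the requirements include PyTorch and torchaudio, which the audio pipeline relies on), confirm that the core libraries import and CUDA is visible:

```python
# Minimal environment check (illustrative; the exact dependency list
# lives in requirements.txt).
import torch
import torchaudio

print("torch:", torch.__version__)
print("torchaudio:", torchaudio.__version__)
print("CUDA available:", torch.cuda.is_available())
```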

### 2. Prepare Data

If you don't have the data, the code will download Speech Commands V2 directly.

After that, the code is organized as follows:

```
Audio
├── src
│   ├── models
│   └── utilities
├── prep_sc.py
└── data
```
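For reference, the download step that `prep_sc.py` automates can be reproduced with torchaudio's built-in Speech Commands loader. This is a sketch under the assumption that the script's actual preprocessing may go further than the raw download:

```python
# Fetch Speech Commands V2 into ./data on first use (illustrative;
# prep_sc.py may apply additional preprocessing on top of this).
import os
import torchaudio

os.makedirs("data", exist_ok=True)
dataset = torchaudio.datasets.SPEECHCOMMANDS(
    root="data",
    url="speech_commands_v0.02",  # the V2 release of the dataset
    download=True,
)
waveform, sample_rate, label, speaker_id, utterance_number = dataset[0]
print(waveform.shape, sample_rate, label)
```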

### 3. Train Model

To make the code easier to use, we provide training scripts:

- For the Speech Commands V2 dataset:

  ```bash
  bash run_sc.sh
  ```
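For context on what the script trains on: AST-style models consume log-mel filterbank "spectrogram images" rather than raw waveforms. The sketch below shows typical AST input preprocessing (parameter settings follow common AST practice; the file path is hypothetical, substitute any 16 kHz clip from the dataset):

```python
# Convert a 1-second keyword clip into a 128-bin log-mel filterbank: the
# 2-D input an AST-style transformer patchifies, much like ViT patchifies
# images. (Illustrative; run_sc.sh drives the actual training pipeline.)
import torchaudio

# Hypothetical example file path.
waveform, sr = torchaudio.load("data/SpeechCommands/speech_commands_v0.02/yes/example.wav")
fbank = torchaudio.compliance.kaldi.fbank(
    waveform,
    htk_compat=True,
    sample_frequency=sr,
    use_energy=False,
    window_type="hanning",
    num_mel_bins=128,
    dither=0.0,
    frame_shift=10.0,
)
print(fbank.shape)  # (num_frames, 128); ~98 frames for a 1 s clip at a 10 ms shift
```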

### 4. Performance of Meta-Transformer

Note that #Param denotes the number of trainable parameters in the whole network.

| Model | Dataset | Acc. (%) | #Param |
| :-: | :-: | :-: | :-: |
| Meta-Transformer-B16 | Speech Commands V2 | 78.3 | 1.1M |
| Meta-Transformer-B16 | Speech Commands V2 | 97.0 | 86.3M |
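The gap between the two rows presumably reflects a frozen-encoder versus fully fine-tuned setting: with the pretrained backbone frozen, only a small task head trains (~1.1M parameters), while unfreezing a ViT-B/16-sized backbone makes all ~86M weights trainable. Below is a hedged sketch of counting and toggling trainable parameters; `backbone` and `head` are illustrative names, not the repo's API:

```python
# How the two training regimes differ in trainable parameters
# (illustrative; the actual training code may organize modules differently).
import torch.nn as nn

def count_trainable(model: nn.Module) -> int:
    """Number of parameters that will receive gradients."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

def set_backbone_frozen(backbone: nn.Module, head: nn.Module, frozen: bool) -> None:
    """Freeze or unfreeze the pretrained encoder; the task head always trains."""
    for p in backbone.parameters():
        p.requires_grad = not frozen
    for p in head.parameters():
        p.requires_grad = True
```

Calling `set_backbone_frozen(backbone, head, frozen=True)` and then `count_trainable(...)` should report only the head's parameters, matching the spirit of the 1.1M row; with `frozen=False`, the count covers the full network, as in the 86.3M row.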