How to use our public age and gender model

This repository introduces our model for age and gender prediction based on wav2vec 2.0. The model is available from doi:10.5281/zenodo.7761387 and released under CC BY-NC-SA 4.0. It was created by fine-tuning the pre-trained wav2vec2-large-robust model on aGender, Mozilla Common Voice, TIMIT and VoxCeleb2. We provide two variants of the model: one with all 24 transformer layers and a stripped-down version with six transformer layers. Both variants were exported to ONNX. Further details are given in the associated paper (tba) and the notebook.

License

The model may be used for non-commercial purposes under CC BY-NC-SA 4.0. For commercial usage, a license for devAIce must be obtained. The source code in this GitHub repository is released under the license included in the repository.

Quick start

Create and activate a Python virtual environment, then install audonnx.

$ pip install audonnx

Load the model with six layers and test it on a random signal.

import audeer
import audonnx
import numpy as np


# Download the six-layer model archive from Zenodo and unpack it
url = 'https://zenodo.org/record/7761387/files/w2v2-L-robust-6-age-gender.25c844af-1.1.1.zip'
cache_root = audeer.mkdir('cache')
model_root = audeer.mkdir('model')

archive_path = audeer.download_url(url, cache_root, verbose=True)
audeer.extract_archive(archive_path, model_root)
model = audonnx.load(model_root)

# Run the model on one second of random noise sampled at 16 kHz
sampling_rate = 16000
signal = np.random.normal(size=sampling_rate).astype(np.float32)
model(signal, sampling_rate)
{'hidden_states': array([[ 0.02783544,  0.01402022,  0.03839185, ...,  0.00786646,
         -0.09332313,  0.0915948 ]], dtype=float32),
 'logits_age': array([[0.3961048]], dtype=float32),
 'logits_gender': array([[ 0.32810774, -0.56528044,  0.0317882 ]], dtype=float32)}

The 'hidden_states' are the pooled states of the last transformer layer. 'logits_age' provides an age score in a range of approximately 0 to 1, where 1 corresponds to 100 years. 'logits_gender' expresses the confidence for the classes female, male and child.
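
To make the raw outputs easier to read, the sketch below converts them to an age in years and a gender label. It is not part of the model's API: `interpret` and `GENDER_LABELS` are hypothetical names, the class order female, male, child is assumed from the description of 'logits_gender', and applying a softmax to the gender logits is an assumption about how to turn them into probabilities. The input dictionary reuses the values printed in the quick start, so the snippet runs without downloading the model.

```python
import numpy as np

# Assumed class order, taken from the description of 'logits_gender'.
GENDER_LABELS = ('female', 'male', 'child')


def interpret(outputs):
    """Turn raw model outputs into an age in years and gender probabilities."""
    # The age score is roughly in 0...1, where 1 corresponds to 100 years.
    age_years = float(outputs['logits_age'][0, 0]) * 100.0

    # Numerically stable softmax over the three gender logits (an assumption).
    logits = outputs['logits_gender'][0].astype(np.float64)
    exp = np.exp(logits - logits.max())
    probs = exp / exp.sum()

    return {
        'age_years': age_years,
        'gender': GENDER_LABELS[int(np.argmax(probs))],
        'gender_probs': dict(zip(GENDER_LABELS, probs.tolist())),
    }


# Values copied from the quick start output.
outputs = {
    'logits_age': np.array([[0.3961048]], dtype=np.float32),
    'logits_gender': np.array(
        [[0.32810774, -0.56528044, 0.0317882]], dtype=np.float32
    ),
}
result = interpret(outputs)
print(result['age_years'], result['gender'])
```

For the random-noise example this yields an age of about 40 years with 'female' as the highest-scoring class. Since the input is noise, these values carry no meaning beyond showing the output format.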

Tutorial

For a detailed introduction, please check out the notebook.

$ pip install -r requirements.txt
$ jupyter notebook notebook.ipynb 

Citation

If you use our model in your own work, please cite the following paper (tba).
