WenwanChen/pvad

pvad

A replication of speaker-conditioned voice activity detection (Personal VAD), based on https://arxiv.org/abs/1908.04284

Classifier: each frame is assigned to one of three classes, {non-speech, target speaker, non-target speaker}
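
To make the three classes concrete, here is a minimal sketch of how per-frame labels could be built from speaker-tagged speech segments. It is an illustration only, not the logic in feature_labels.py or correct_target_labels.py: the function name `frame_labels`, the class indices, the `(start, end, speaker)` segment format, and the 10 ms frame shift are all assumptions.

```python
from typing import List, Tuple

# Hypothetical class indices; the repository may use a different mapping.
NON_SPEECH, TARGET, NON_TARGET = 0, 1, 2


def frame_labels(
    segments: List[Tuple[float, float, str]],
    target_speaker: str,
    duration: float,
    frame_shift: float = 0.01,  # assumed 10 ms hop
) -> List[int]:
    """Label every frame given (start_sec, end_sec, speaker_id) speech segments."""
    num_frames = int(round(duration / frame_shift))
    # Frames not covered by any segment stay non-speech.
    labels = [NON_SPEECH] * num_frames
    for start, end, speaker in segments:
        first = int(round(start / frame_shift))
        last = min(int(round(end / frame_shift)), num_frames)
        label = TARGET if speaker == target_speaker else NON_TARGET
        for i in range(first, last):
            labels[i] = label
    return labels


# Toy utterance: spk1 speaks for 20 ms, a 10 ms pause, then spk2 for 20 ms.
example = frame_labels(
    [(0.0, 0.02, "spk1"), (0.03, 0.05, "spk2")],
    target_speaker="spk1",
    duration=0.06,
)
```

The same utterance yields different labels for different target speakers, which is why labels must be regenerated whenever the target changes.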

  1. Synthetic dataset generation
    prep4kaldi.sh
    flac_to_wav.sh
    concat.sh concat.py
    augment.py

  2. Prepare target speaker embeddings
    extract_embeddings.py

  3. Extract features and labels
    correct_target_labels.py
    fbank.py
    feature_labels.py

  4. Data loader
    dataloader.py
    dataloader_test.py

  5. Model definition and training
    pvad_training.py

  6. Saved model
    checkpoint_oct22_coswarm.t7

  7. Test
    test.py
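
Steps 2 to 4 produce a target speaker embedding and per-frame features. One conditioning scheme described in the referenced paper appends the target speaker embedding to every acoustic frame before it enters the network. Whether pvad_training.py does exactly this is not stated here, so the sketch below is only an assumption-laden illustration: `condition_on_speaker` is a hypothetical helper, and the toy dimensions are far smaller than real filterbank or embedding sizes.

```python
from typing import List, Sequence


def condition_on_speaker(
    frames: Sequence[Sequence[float]],
    embedding: Sequence[float],
) -> List[List[float]]:
    """Append the same speaker embedding to every acoustic frame."""
    return [list(frame) + list(embedding) for frame in frames]


# Toy input: 2 frames of 3 features each, and a 2-dimensional embedding.
frames = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
embedding = [1.0, -1.0]

conditioned = condition_on_speaker(frames, embedding)
# Each conditioned frame now has 3 + 2 = 5 values.
```

With this layout the network input size is the feature dimension plus the embedding dimension, and the embedding is constant across all frames of an utterance.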
