Stars
Text-Prompted Generative Audio Model
Defending against Adversarial Audio via Diffusion Model (ICLR 2023)
Fine-tune Whisper model for Taiwanese speech recognition
Replace arXiv links by their corresponding bibliography in markdowns / Notion database
haloha123 / faster-whisper
Forked from SYSTRAN/faster-whisper
Faster Whisper transcription with CTranslate2
Transcribe a Collection of Waveform Audio Files using whisper_timestamped
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
A not very efficient attempt to create a real-time openai/whisper (Audio to Text Transcriber)
Phoneme Recognition using pre-trained models Wav2vec2, HuBERT and WavLM. Throughout this project, we compared specifically three different self-supervised models, Wav2vec (2019, 2020), HuBERT (2021…
Paper reading list in conversational AI (constantly updating).
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
Implementation of Bit Diffusion, Hinton's group's attempt at discrete denoising diffusion, in Pytorch
Chinese speech pretrained models
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Transcribing Speech with Multinomial Diffusion, training code and models.
Revisiting Denoising Diffusion Probabilistic Models for Speech Enhancement: Condition Collapse, Efficiency and Refinement, Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI), 2023.
PyTorch implementation of the Perceptual Evaluation of Speech Quality for wideband audio
Trainer for audio-diffusion-pytorch
Code for a paper exploring using diffusion models to defend neural networks against adversarial attacks
Domain adaptation made easy. Fully featured, modular, and customizable.