Stars
[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching
AudioBench: A Universal Benchmark for Audio Large Language Models
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
Code and documentation to train Stanford's Alpaca models, and generate the data.
Official repository of SepReformer for speech separation
Official code and models for the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation"
ASCII generator (image to text, image to image, video to video)
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
SCOREQ: Speech COntrastive REgression for Quality Assessment (NeurIPS 2024)
This GitHub repo is for the NeurIPS 2021 and Interspeech 2022 papers on Non-Matching Reference based speech quality assessment.
This repo hosts the code and model of "Separate What You Describe: Language-Queried Audio Source Separation", Interspeech 2022
The project uses Python to implement the PointNet training process, while leveraging GPU acceleration, C++, and CUDA for efficient inference.
Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation
BLSP: Bootstrapping Language-Speech Pre-training via Behavior Alignment of Continuation Writing
The implementation of "X-TF-GridNet: A Time-Frequency Domain Target Speaker Extraction Network with Adaptive Speaker Embedding Fusion", accepted by Information Fusion.
✨✨Latest Advances on Multimodal Large Language Models
✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM
Analytic Class Incremental Learning for Sound Source Localization with Privacy Protection
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
PointNet and PointNet++ implemented in PyTorch (pure Python), on ModelNet, ShapeNet, and S3DIS.
PyTorch implementation for Score-Based Generative Modeling through Stochastic Differential Equations (ICLR 2021, Oral)