Stars
(NeurIPS 2024 Oral 🔥) Improved Distribution Matching Distillation for Fast Image Synthesis
[ICLR 2024] Continual Momentum Filtering on Parameter Space for Online Test-time Adaptation.
An unofficial PyTorch implementation of StreamVC (Real-Time Low-Latency Voice Conversion)
Generative models for conditional audio generation
Pitch Estimating Neural Networks (PENN)
Inference and training library for high-quality TTS models.
A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.
This is the official train-dev-test release of the Interspeech2024 Discrete Speech Representation Challenge.
PyTorch implementation of the Perceptual Evaluation of Speech Quality for wideband audio
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…
Code and dataset for photorealistic Codec Avatars driven from audio
Official code and models for the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation"
pdfrx is yet another PDF viewer implementation, built on top of PDFium. The plugin currently supports Android, iOS, Windows, macOS, Linux, and Web.
Versatile audio super resolution (any -> 48kHz) with AudioSR.
An Open-source Streaming High-fidelity Neural Audio Codec
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…
Speech self-supervised representations