- Institute of Acoustics, Chinese Academy of Sciences
- Beijing
Stars
This is the PyTorch implementation of Universal Source Separation with Weakly Labelled Data.
⚡️HivisionIDPhotos: a lightweight and efficient AI ID photo tool. A lightweight algorithm for making AI ID photos.
Music repair method to convert lossy MP3 compressed music to lossless music.
[Interspeech 2024] Hold Me Tight: Stable Encoder-Decoder Design for Speech Enhancement
Robust Speech Recognition via Large-Scale Weak Supervision
This repo hosts the code and models of "Masked Autoencoders that Listen".
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…
A Python library for speech enhancement
The official implementation of GTCRN, an ultra-lite speech enhancement model.
A generative speech model for daily dialogue.
A PyTorch quantization backend for Optimum
Stable Diffusion web UI
StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation
This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.
A PyTorch model profiler that reports MACs, energy, etc.
Deep Xi: A deep learning approach to a priori SNR estimation implemented in TensorFlow 2/Keras. For speech enhancement and robust ASR.
Production-first, NN-based on-device signal processing toolkit.
GUI for a Vocal Remover that uses Deep Neural Networks.
SoftVC VITS Singing Voice Conversion
AudioLDM: Generate speech, sound effects, music and beyond, with text.
Apply diffusion models using the new Hugging Face diffusers package to synthesize music instead of images.