- Institute of Acoustics, Chinese Academy of Sciences
- Beijing
Stars
Stable Diffusion web UI
Robust Speech Recognition via Large-Scale Weak Supervision
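Dive into Deep Learning: written for Chinese readers, runnable, and open to discussion. The Chinese and English editions are used for teaching at more than 500 universities in over 70 countries.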
A generative speech model for daily dialogue.
SoftVC VITS Singing Voice Conversion
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
GUI for a Vocal Remover that uses Deep Neural Networks.
Fast and memory-efficient exact attention
⚡️HivisionIDPhotos: a lightweight and efficient AI tool for making ID photos.
StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation
Code for the paper Hybrid Spectrogram and Waveform Source Separation
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…
FLOPs counter for convolutional networks in the PyTorch framework
Noise suppression using deep filtering
AudioLDM: Generate speech, sound effects, music and beyond, with text.
[IEEE TMI] Official Implementation for UNet++
MobileNetV3 in PyTorch, with pre-trained models provided
Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.
Awesome Lists for Tenure-Track Assistant Professors and PhD students. (A survival guide for assistant professors and PhD students)
SincNet is a neural architecture for efficiently processing raw audio samples.
[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)
A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR
A pytorch quantization backend for optimum
[CVPR 2022] Official implementation of the paper "Uformer: A General U-Shaped Transformer for Image Restoration".