Undergrad at USTC. Intern at ShengShu
- Beijing
- https://scholar.google.com/citations?user=uDk9qSMAAAAJ
- https://www.zhihu.com/people/shen-qi-de-ai-zi-76
Lists (3)
Stars
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
Generative models for conditional audio generation
Versatile audio super resolution (any -> 48kHz) with AudioSR.
Shortcut flow matching PyTorch implementation
The official Implementation of PeriodWave and PeriodWave-Turbo
Unified automatic quality assessment for speech, music, and sound.
Easy-to-use implementation of the ISMIR 2013 Audio Degradation Toolbox
[CVPR 2025 Oral] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models
Janus-Series: Unified Multimodal Understanding and Generation Models
📖 This is a repository for organizing papers, codes and other resources related to unified multimodal models.
Music repair method to convert lossy MP3 compressed music to lossless music.
[ICASSP 2025] "FLowHigh: Towards efficient and high-quality audio super-resolution with single-step flow matching"
[CVPR'25] Official Implementations for Paper - MagicQuill: An Intelligent Interactive Image Editing System
Official PyTorch implementation of BigVGAN (ICLR 2023)
Official implementation of "AEROMamba: An efficient architecture for audio super-resolution using generative adversarial networks and state space models", presented at the LAMIR 2024 Workshop
Implementation of Voicebox, new SOTA text-to-speech network from MetaAI, in PyTorch
🔥🔥🔥A curated list of papers on recent diffusion-based high-resolution image and video synthesis works.