- University of Science and Technology of China/Shengshu AI
- Beijing
Stars
Unified automatic quality assessment for speech, music, and sound.
PyTorch implementation of AudioLCM (ACM-MM'24): efficient and high-quality text-to-audio generation with a latent consistency model.
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
ivcylc / Inf-DiT
Forked from THUDM/Inf-DiT
Official implementation of Inf-DiT: Upsampling Any-Resolution Image with Memory-Efficient Diffusion Transformer
ivcylc / Co-DETR
Forked from Sense-X/Co-DETR
[ICCV 2023] DETRs with Collaborative Hybrid Assignments Training
ivcylc / WaveWizard
Forked from JackVinati/WaveWizard
A Gradio app for analyzing audio files to determine true sample rate and bit depth.
SwissArmyTransformer is a flexible and powerful library to develop your own Transformer variants.
OpenMusic: SOTA Text-to-music (TTM) Generation
Refactored and updated version of `stable-audio-tools`, an open-source codebase for audio/music generative models originally by Stability AI.
Generative models for conditional audio generation
AudioLDM training, finetuning, evaluation and inference.
AudioLDM: Generate speech, sound effects, music and beyond, with text.
Code for paper "Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System"
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Easy-to-use and high-performance NLP and LLM framework based on MindSpore, compatible with models and datasets of 🤗Huggingface.