Text-to-Music Generation with Rectified Flow Transformers

Python 1,538 116 Updated Sep 6, 2024

VITS with phoneme-level prosody modeling based on MaskGIT

Python 74 7 Updated Aug 31, 2024

Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model

Python 89 3 Updated Oct 1, 2024

PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI

Python 614 23 Updated Oct 7, 2024

SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling

Python 686 39 Updated Sep 21, 2024

The open source code for SimpleSpeech series

Python 90 6 Updated Aug 19, 2024

Evaluation Protocol for Large-Scale Zero-Shot TTS Literature

Python 49 7 Updated Sep 26, 2024

LlamaVoice is a llama-based large voice generation model, providing inference and training ability.

Python 214 11 Updated Aug 26, 2024

lina-speech : linear attention based text-to-speech

Jupyter Notebook 116 9 Updated Jun 3, 2024

A fast speech-to-any translation model that supports simultaneous decoding and offers 28× speedup.

Python 60 4 Updated Aug 12, 2024

Automatically update text-to-speech (TTS) papers daily using GitHub Actions (updates every 12 hours)

Python 232 19 Updated Oct 8, 2024

official code for CVPR'24 paper Diff-BGM

Python 40 3 Updated Mar 28, 2024

✨✨Latest Advances on Multimodal Large Language Models

12,046 769 Updated Oct 7, 2024

Inference and training library for high-quality TTS models.

Python 4,319 436 Updated Sep 23, 2024

GPT-style network for phonemization with durations of text

Jupyter Notebook 62 9 Updated Mar 21, 2024

🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. 🎧👥📊 Advanced audio processing.

Python 197 20 Updated Jun 10, 2024

Official implementation of "Separate Anything You Describe"

Python 1,587 115 Updated Mar 31, 2024

vits2 backbone with multilingual-bert

Python 7,862 1,114 Updated Oct 7, 2024

List of speech synthesis papers.

993 120 Updated Jul 24, 2023

Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement

Python 293 44 Updated Sep 13, 2024

MP-SENet: A Speech Enhancement Model with Parallel Denoising of Magnitude and Phase Spectra

Python 1 Updated Dec 24, 2023

AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio a…

476 33 Updated Sep 6, 2024

This is a list of datasets consisting of speech, music, and sound effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio …

1 Updated Nov 8, 2023

LLMs interview notes and answers: this repository mainly records interview questions and reference answers for large language model (LLM) algorithm engineers

1,149 255 Updated Dec 14, 2023

A simple and open-source analogue of the HeyGen system

Python 881 178 Updated Aug 1, 2024

text to speech using autoregressive transformer and VITS

Python 224 15 Updated Apr 3, 2024
Python 253 35 Updated May 15, 2023