shuaijiang

🍉

summer

57 followers · 39 following

PKU
Beijing, China
http://zhaoshuaijiang.com

Stars

Labbeti / aac-datasets

Audio Captioning datasets for PyTorch.

Python 114 6 Updated Nov 4, 2024

stepfun-ai / Step-Audio

Python 3,433 265 Updated Feb 25, 2025

PKU-Alignment / align-anything

Align Anything: Training All-modality Model with Feedback

Python 2,309 327 Updated Feb 19, 2025

Jiayi-Pan / TinyZero

Clean, minimal, accessible reproduction of DeepSeek R1-Zero

Python 10,672 1,372 Updated Feb 1, 2025

aliyun / alibabacloud-bailian-speech-demo

Sample Repository for the AlibabaCloud Bailian Speech SDK

92 6 Updated Feb 14, 2025

LqNoob / Neural-Codec-and-Speech-Language-Models

Awesome Neural Codec Models, Text-to-Speech Synthesizers & Speech Language Models

Python 108 6 Updated Feb 25, 2025

DennisThink / awesome_twitter_CN

Chinese-language Twitter users worth following

Python 935 37 Updated Jan 8, 2025

ZJU-LLMs / Foundations-of-LLMs

7,675 652 Updated Jan 14, 2025

voicepaw / so-vits-svc-fork

so-vits-svc fork with realtime support, improved interface and more features.

Python 8,910 1,181 Updated Feb 24, 2025

tzyll / ChineseHP

15 1 Updated Jul 4, 2024

FunAudioLLM / CosyVoice

Multi-lingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

Python 11,067 1,087 Updated Feb 25, 2025

shuaijiang / speech-resynthesis

Forked from facebookresearch/speech-resynthesis

An official reimplementation of the method described in the INTERSPEECH 2021 paper - Speech Resynthesis from Discrete Disentangled Self-Supervised Representations.

Python 1 Updated Aug 29, 2023

facebookresearch / speech-resynthesis

An official reimplementation of the method described in the INTERSPEECH 2021 paper - Speech Resynthesis from Discrete Disentangled Self-Supervised Representations.

Python 400 57 Updated Aug 29, 2023

MatthewCYM / VoiceBench

VoiceBench: Benchmarking LLM-Based Voice Assistants

Python 122 8 Updated Feb 25, 2025

shuaijiang / Whisper-Finetune

Forked from yeyupiaoling/Whisper-Finetune

Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deplo…

C 234 14 Updated Dec 16, 2024

WeThinkIn / Interview-for-Algorithm-Engineer

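[Three Years of Interviews, Five Years of Mock Exams] An interview handbook for AI algorithm engineers, covering interview and written-test experience and practical knowledge from across the AI industry: AIGC, traditional deep learning, autonomous driving, machine learning, computer vision, natural language processing, reinforcement learning, embodied intelligence, the metaverse, AGI, and more.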

1,140 170 Updated Feb 25, 2025

kyutai-labs / moshi

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.

Python 7,554 607 Updated Feb 25, 2025

bytedance / SALMONN

SALMONN: Speech Audio Language Music Open Neural Network

Python 1,156 91 Updated Dec 12, 2024

karpathy / LLM101n

LLM101n: Let's build a Storyteller

31,964 1,735 Updated Aug 1, 2024

Azure-Samples / cognitive-services-speech-sdk

Sample code for the Microsoft Cognitive Services Speech SDK

C# 3,075 1,904 Updated Feb 21, 2025

RUCAIBox / LLMSurvey

The official GitHub page for the survey paper "A Survey of Large Language Models".

Python 11,062 861 Updated Aug 20, 2024

LSimon95 / megatts2

Unofficial implementation of Megatts2

Python 276 36 Updated Mar 23, 2024

serp-ai / bark-with-voice-clone

Forked from suno-ai/bark

🔊 Text-prompted Generative Audio Model - With the ability to clone voices

Jupyter Notebook 3,247 434 Updated Jun 12, 2024

huggingface / dataspeech

Python 348 53 Updated Sep 3, 2024

shuaijiang / ke-data-juicer

Forked from modelscope/data-juicer

A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Python 1 Updated Mar 13, 2024

X-PLUG / MobileAgent

Mobile-Agent: The Powerful Mobile Device Operation Assistant Family

Python 3,483 333 Updated Feb 21, 2025

modelscope / data-juicer

Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Python 3,737 211 Updated Feb 25, 2025

declare-lab / MELD

MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversation

Python 870 211 Updated Mar 10, 2024

huggingface / distil-whisper

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.

Python 3,749 311 Updated Jan 8, 2025

jiaaro / pydub

Manipulate audio with a simple and easy high level interface

Python 9,194 1,073 Updated Jul 25, 2024