MaxMax2016

MaxMax MaxMax2016

Deep Learning Beginners

347 followers · 896 following

moshi Public
Forked from kyutai-labs/moshi

Similar to GPT-4o: end-to-end speech interaction

Python Apache License 2.0 Updated Sep 19, 2024
seed-vc Public
Forked from Plachtaa/seed-vc

seed-tts: zero-shot voice conversion with in context learning

Python MIT License Updated Sep 14, 2024
stable-speech Public
Forked from huggingface/parler-tts

Reproduction of Stability AI's Text-to-Speech model.

Python Apache License 2.0 Updated Sep 14, 2024
speech-trident Public
Forked from ga642381/speech-trident

Awesome speech/audio LLMs, representation learning, and codec models

2 Updated Sep 14, 2024
e2-tts-pytorch Public
Forked from lucidrains/e2-tts-pytorch

Flow-matching Transformer. Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in PyTorch

Python MIT License Updated Sep 10, 2024
FireRedTTS Public
Forked from FireRedTeam/FireRedTTS

Xiaohongshu's large speech synthesis model

Updated Sep 6, 2024
HierSpeechpp Public
Forked from sh-lee-prml/HierSpeechpp

The official implementation of HierSpeech++

Python MIT License Updated Sep 5, 2024
TTS-arxiv-daily Public
Forked from liutaocode/TTS-arxiv-daily

Automatically update text-to-speech (TTS) papers daily using GitHub Actions (updates every 12 hours)

Python Apache License 2.0 Updated Aug 22, 2024
REAL_TIME_NKF_AEC Public
Forked from William1617/REAL_TIME_NKF_AEC

Neural network echo cancellation, implemented in C

C++ 1 Updated Jul 29, 2024
FasterLivePortrait Public
Forked from warmshao/FasterLivePortrait

Bring portraits to life in real time! ONNX/TensorRT support!

Python Updated Jul 25, 2024
gryannote Public
Forked from clement-pages/gryannote

Speaker recognition. Provides Gradio custom components to make the diarization-based audio labeling process easier and faster.

Svelte MIT License Updated Jul 19, 2024
SysMocap Public
Forked from xianfei/SysMocap

A complete solution for digital-human motion capture and driving. A real-time motion capture system for animating 3D virtual characters.

JavaScript Mozilla Public License 2.0 Updated Jul 18, 2024
SenseVoice-onnx Public
Forked from lovemefan/SenseVoice-python

SenseVoice with ONNX Runtime

Python Updated Jul 18, 2024
Qwen2-Audio Public
Forked from QwenLM/Qwen2-Audio

The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.

Updated Jul 16, 2024
tinyspeech Public
Forked from AkshathRaghav/tinyspeech

Code release for "TinySpeech: Attention Condensers for Deep Speech Recognition Neural Networks on Edge Devices"

Python MIT License Updated Jul 15, 2024
Diff-MST Public
Forked from sai-soum/Diff-MST

Music generation. Multitrack music mixing style transfer given a reference song, using a differentiable mixing console.

Jupyter Notebook Other Updated Jul 11, 2024
optispeech Public
Forked from mush42/optispeech

TTS: a lightweight end-to-end text-to-speech model

Python MIT License Updated Jul 10, 2024
BigVGAN-Official Public
Forked from NVIDIA/BigVGAN

Finally open-sourced. Official implementation of BigVGAN in PyTorch

Python MIT License Updated Jul 10, 2024
silero-vad Public
Forked from snakers4/silero-vad

A VAD trained on large-scale data. Silero VAD: pre-trained enterprise-grade Voice Activity Detector, Language Classifier and Spoken Number Detector

Python MIT License Updated Jul 10, 2024
MaxKB Public
Forked from 1Panel-dev/MaxKB

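RAG 🚀 A knowledge-base question-answering system built on large language models (LLMs). Works out of the box, is model-agnostic, supports flexible orchestration, and can be quickly embedded into third-party business systems. Officially produced by 1Panel.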

Python GNU General Public License v3.0 Updated Jul 9, 2024
MARS5-TTS Public
Forked from Camb-ai/MARS5-TTS

MARS5 speech model (TTS) from CAMB.AI

Python GNU Affero General Public License v3.0 Updated Jul 8, 2024
LivePortrait Public
Forked from KwaiVGI/LivePortrait

Portrait motion transfer. Make one portrait alive!

Python MIT License Updated Jul 8, 2024
CosyVoice Public
Forked from FunAudioLLM/CosyVoice

LLM based TTS model, providing inference/training/deployment full-stack ability.

Python 1 Apache License 2.0 Updated Jul 8, 2024
promonet Public
Forked from maxrmorrison/promonet

Speech editing. Prosody and Pronunciation Modification Network

Python MIT License Updated Jul 7, 2024
SenseVoice Public
Forked from FunAudioLLM/SenseVoice

Essential for cleaning voice-cloning data. Multilingual Voice Understanding Model

Python 1 MIT License Updated Jul 4, 2024
StreamingHiFiGAN Public
Forked from facebookresearch/AudioDec

An Open-source Streaming High-fidelity Neural Audio Codec

Python 10 Other Updated Jul 4, 2024
DEX-TTS Public
Forked from winddori2002/DEX-TTS

DEX-TTS: Diffusion-based EXpressive TTS with Style Modeling on Time Variability

Python 3 MIT License Updated Jul 3, 2024
noise-reduction Public
Forked from dengcunqin/noise-reduction

noise reduction

Python Updated Jul 3, 2024
faster-whisper Public
Forked from SYSTRAN/faster-whisper

Faster Whisper transcription with CTranslate2

Python MIT License Updated Jul 2, 2024
BetterFastSpeech2 Public
Forked from shivammehta25/BetterFastSpeech2

A cleaned-up, refactored version of the FastSpeech2 code

Jupyter Notebook MIT License Updated Jul 1, 2024
