FireRedTTS Public
Forked from FireRedTeam/FireRedTTS
An Open-Sourced LLM-empowered Foundation TTS System
Python Updated Sep 25, 2024
Draw-an-Audio-Code Public
Forked from yannqi/Draw-an-Audio-Code
Official code of the paper: Draw an Audio: Leveraging Multi-Instruction for Video-to-Audio Synthesis.
Apache License 2.0 Updated Sep 11, 2024
S3Tokenizer Public
Forked from xingchensong/S3Tokenizer
Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice
Python Apache License 2.0 Updated Sep 10, 2024
mini-omni Public
Forked from gpt-omni/mini-omni
open-source multimodal large language model that can hear, talk while thinking. Featuring real-time end-to-end speech input and streaming audio output conversational capabilities.
Python MIT License Updated Sep 9, 2024
punctuator Public
Forked from FerdinandZhong/punctuator
A small seq2seq punctuator tool based on DistilBERT
Python Apache License 2.0 Updated Sep 8, 2024
Deep-Live-Cam Public
Forked from hacksider/Deep-Live-Cam
real time face swap and one-click video deepfake with only a single image
Python GNU Affero General Public License v3.0 Updated Aug 21, 2024
SenseVoice Public
Forked from FunAudioLLM/SenseVoice
Multilingual Voice Understanding Model
Python MIT License Updated Jul 5, 2024
CosyVoice Public
Forked from FunAudioLLM/CosyVoice
LLM based TTS model, providing inference/training/deployment full-stack ability.
Python Apache License 2.0 Updated Jul 5, 2024
yt-dlp Public
Forked from yt-dlp/yt-dlp
A feature-rich command-line audio/video downloader
Python The Unlicense Updated Jun 17, 2024
g2pW Public
Forked from GitYCC/g2pW
Chinese Mandarin Grapheme-to-Phoneme Converter. Converts Chinese text to Zhuyin or Pinyin (INTERSPEECH 2022)
Python Apache License 2.0 Updated Jun 16, 2024
Awesome-Talking-Face Public
Forked from JosephPai/Awesome-Talking-Face
📖 A curated list of resources dedicated to talking face.
MIT License Updated Jun 4, 2024
SLAM-LLM Public
Forked from X-LANCE/SLAM-LLM
Speech, Language, Audio, Music Processing with Large Language Model
Python MIT License Updated Jun 2, 2024
Diff-Foley Public
Forked from luosiallen/Diff-Foley
Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models
Python Apache License 2.0 Updated May 29, 2024
ChatTTS Public
Forked from 2noise/ChatTTS
ChatTTS is a generative speech model for daily dialogue.
Jupyter Notebook Other Updated May 29, 2024
Codecfake Public
Forked from xieyuankun/Codecfake
This is the official repo of our work titled "The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio".
Python Updated May 16, 2024
diffusers Public
Forked from huggingface/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
Python Apache License 2.0 Updated Apr 19, 2024
VoiceCraft Public
Forked from jasonppy/VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
Python Other Updated Mar 25, 2024
Open-Sora Public
Forked from hpcaitech/Open-Sora
Building your own video generation model like OpenAI's Sora
Python Apache License 2.0 Updated Mar 6, 2024
minisora Public
Forked from mini-sora/minisora
The Mini Sora project aims to explore the implementation path and future development direction of Sora.
Python Apache License 2.0 Updated Mar 4, 2024
Open-Sora-Plan Public
Forked from PKU-YuanGroup/Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model), but we have only limited resources. We sincerely hope the whole open-source community can contribute to this project.
Jupyter Notebook Other Updated Mar 4, 2024
DeepLearningSystem Public
Forked from chenzomi12/AISystem
Deep Learning System core principles introduction.
Jupyter Notebook Apache License 2.0 Updated Mar 3, 2024
Awesome-Video-Diffusion-Models Public
Forked from ChenHsing/Awesome-Video-Diffusion-Models
[Arxiv] A Survey on Video Diffusion Models
Updated Mar 2, 2024
Awesome-Text-to-Image Public
Forked from Yutong-Zhou-cv/Awesome-Text-to-Image
(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.
MIT License Updated Mar 1, 2024
jepa Public
Forked from facebookresearch/jepa
PyTorch code and models for V-JEPA self-supervised learning from video.
Python Other Updated Feb 20, 2024
llm-paper-daily Public
Forked from xianshang33/llm-paper-daily
Daily updated LLM papers. Subscriptions are welcome 👏 If you like it, please give it a 🌟
Updated Jan 26, 2024
Qwen-VL Public
Forked from QwenLM/Qwen-VL
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
Python Other Updated Jan 22, 2024
GPT-SoVITS Public
Forked from RVC-Boss/GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
Python MIT License Updated Jan 17, 2024
fish-speech Public
Forked from fishaudio/fish-speech
Brand new TTS solution
Python BSD 3-Clause "New" or "Revised" License Updated Jan 15, 2024