Stars
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
real time face swap and one-click video deepfake with only a single image
bigfootcn / WeChatMsg-
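Forked from LC044/WeChatMsg
Extracts WeChat chat history and exports it to HTML, Word, and Excel documents for permanent storage; analyzes the chat history to generate an annual chat report; and uses the chat data to train a personal AI chat assistant.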
An open-source AI content search engine designed specifically for content creators. Supports extraction of text, images, and short videos. Allows full local deployment (web app, RAG server, LLM ser…
A generative speech model for daily dialogue.
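🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili, with support for API calls and online batch parsing and downloading.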
Incredibly fast Whisper-large-v3
Curated list of project-based tutorials
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Translate a video from one language to another and add dubbing. Also supports speech recognition and transcription, speech synthesis, and subtitle translation.
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
Official Code for DragGAN (SIGGRAPH 2023)
A GUI tool for generating subtitles from video and audio and exporting them as SRT files. No third-party API registration is needed; audio-to-text conversion runs locally. A Transformer-based video subtitle generation framework.
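A visual no-code/code-free web crawler/spider. EasySpider (易采集): a visual browser automation testing, data collection, and crawler tool that lets you design and run crawler tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service-wrapping system for web applications.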
🤖️ Cross-platform AI language practice app
Ecoute is a live transcription tool that provides real-time transcripts for both the user's microphone input (You) and the user's speakers output (Speaker) in a textbox. It also generates a suggest…
🔉 YouTube Videos Transcription with OpenAI's Whisper
serp-ai / bark-with-voice-clone
Forked from suno-ai/bark
🔊 Text-prompted Generative Audio Model - With the ability to clone voices
so-vits-svc fork with realtime support, improved interface and more features.
SoftVC VITS Singing Voice Conversion
Repo for BenTsao (本草) [original name: HuaTuo (华驼)], Instruction-tuning Large Language Models with Chinese Medical Knowledge.
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
An all-in-one solution for adding Temporal Stability to a Stable Diffusion Render via an automatic1111 extension