
Starred repositories
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
Unofficial PyTorch implementation of Google AI's VoiceFilter system
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
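How voiceprint recognition works in a WeChat page
A GUI tool for extracting hard-coded subtitles (hardsub) from videos and generating srt files. No third-party API is required; text recognition runs locally. A deep-learning-based video subtitle extraction framework that includes subtitle region detection and subtitle content extraction.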
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
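The official gpt4free repository | a collection of various powerful language models | o3 and deepseek r1, gpt-4.5
Open-source, accurate and easy-to-use video speech recognition & clipping tool, with LLM-based AI clipping integrated.
Batch-download WeChat Official Account articles online, with support for read counts, comments, and embedded audio/video. No environment setup is required, article styling is reproduced 100%, and private deployment is supported.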
Simple Online Realtime Tracking with a Deep Association Metric
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
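🎉🎉🎉 Java senior architect tech stack == any skill can be thoroughly mastered through "deliberate practice". Just like cooking, here is a Java development handbook; all you need to do is practice more. 🏃🏃🏃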
Blendshape and kinematics calculator for Mediapipe/Tensorflow.js Face, Eyes, Pose, and Finger tracking models.
A real-time motion capture system for 3D virtual character animating.
OpenMMLab Pose Estimation Toolbox and Benchmark.
TimesFM (Time Series Foundation Model) is a pretrained time-series foundation model developed by Google Research for time-series forecasting.
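Sz-Admin: an open-source RBAC admin and back-office framework designed for modern applications. It combines an up-to-date tech stack, including Spring Boot 3, JDK 21, Mybatis Flex, Sa-Token, Knife4j, and Flyway on the backend and Vue 3, Vite5, TypeScript, and Element Plus on the frontend, and aims to provide an intuitive, smooth, and powerful development experience.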
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
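forest (森林): a modern knowledge-community backend project, built with SpringBoot + Shiro + MyBatis + JWT + Redis
The fuint membership marketing system is a membership management, points mall, and marketing system for physical stores. Built on Java SpringBoot, Vue, and Uniapp, it includes a customer-facing WeChat mini program, H5, and a back-office management and cashier client. Its marketing features include coupons, prepaid cards, physical cards, punch cards (per-visit cards), stored-value cards, e-vouchers, a member points system, and membership tiers. It suits all kinds of physical stores that combine with an online e-commerce system, such as retail supermarkets, car 4S dealerships, flower shops, dessert shops, and restaurants. The system can be used as a cashier system, connecting the offline cashier system with online member…
🌮 Taco Mall (塔可商城): an open-source, cross-platform project built on the springboot3 + uniapp + vue3 stack, comprising a mini program, an admin console, and backend services. It is a new-retail e-commerce system with built-in member distribution, regional agency, and product retail features.
Enterprise-grade system for rapid LLM API integration, supporting OpenAI, Azure, ERNIE Bot (文心一言), iFlytek Spark (讯飞星火), Tongyi Qianwen (通义千问), Zhipu GLM, Gemini, DeepSeek, Anthropic Claude, and OpenAI-format models. Clean page style, lightweight, efficient, and stable, with one-click Docker deployment.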
Starred topics
