An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

Python 2,572 198 Updated Apr 14, 2025

maum-ai / voicefilter

Unofficial PyTorch implementation of Google AI's VoiceFilter system

Python 1,134 228 Updated Jul 25, 2024

Ryuk17 / SpeechAlgorithms

You can find the speech algorithms you want here

C 796 248 Updated Jan 1, 2025

grazder / DeepFilterNet

Forked from Rikorose/DeepFilterNet

Noise suppression using deep filtering

Python 27 4 Updated May 23, 2024

yuyun2000 / SpeechDenoiser

SpeechDenoiser: Real-Time Speech Denoising with ONNX Welcome to SpeechDenoiser, a simple and effective solution for real-time speech denoising using an ONNX model. This repository contains everythi…

Python 72 12 Updated Aug 16, 2024

manyeyes / AliParaformerAsr

C# library for decoding Paraformer and SenseVoice models, used in speech recognition (ASR)

C# 49 4 Updated Aug 23, 2024

wenet-e2e / wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

Python 4,446 1,130 Updated Mar 29, 2025

HaujetZhao / FunASR-Online-Paraformer-Test

Python 47 4 Updated Nov 26, 2023

k2-fsa / sherpa-onnx

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS…

C++ 5,606 629 Updated Apr 10, 2025

Michaelszj / pro-tracker

Python 78 14 Updated Jan 14, 2025

siyuanliii / masa

Official Implementation of CVPR24 highlight paper: Matching Anything by Segmenting Anything

Python 1,261 83 Updated Nov 7, 2024

facebookresearch / videoseal

Open and efficient video watermarking

Python 349 42 Updated Apr 4, 2025

xinglunancv / Yolov5-DeepLabV3Plus-MeterReader

Meter detection, pointer and dial segmentation, and scale reading recognition using YOLOv5 + DeepLabV3Plus

C++ 44 10 Updated Oct 14, 2021

zhahoi / YoloX-DeepLabV3Plus-MeterReader

Meter detection, pointer and dial segmentation, and scale reading recognition using YoloX + DeepLabV3Plus (with the ncnn framework)

C++ 27 5 Updated Oct 12, 2024

hasanirtiza / Pedestron

[Pedestron] Generalizable Pedestrian Detection: The Elephant In The Room. @ CVPR2021

Python 701 157 Updated Dec 4, 2024

RapidAI / RapidOrientation

Document orientation classification

Python 216 15 Updated Nov 20, 2024

RapidAI / RapidUnDistort

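Corrects document distortion, blur, shadows, and similar defects, using ONNX models for simple, lightweight deployment; will keep following the latest and best document rectification methods and models. Correct document distortion using a lightweight ONNX model for easy deployment. We will continue to follow and integrate the latest and best docu…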

Python 50 8 Updated Dec 15, 2024

memoavatar / memo

Memory-Guided Diffusion for Expressive Talking Video Generation

Python 780 88 Updated Jan 24, 2025

Zheng Li pango99

Lists (32)

3DPose

3D object detection

ai toy

AIGame

AI drawing

GitHub proxy

GL_DX_InterOP

Live2D

Mocap

NDI

NERF

TensorRT

Text->Image

tracking

TRT Plugin

UE_Plugin

Unity

VRoid

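Face detection

Sports detection

Image change detection

Image stitching

Multi-object tracking

Slow motion

Hand detection

Data visualization

Streaming media

Depth estimation

Video matting

Video encoding/decoding

Speech

Super-resolution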

Starred repositories

slow-motion

video-frame-interpolation