Stars
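📚 A summary of basic knowledge for C/C++ technical interviews, covering the language, libraries, data structures, algorithms, systems, networking, and linking and loading, along with interview experience, recruitment, and referral information. This repository summarizes basic knowledge for job seekers and beginners in C/C++ technology, in…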
Speech-to-text, text-to-speech, speaker diarization, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC…
Effortless data labeling with AI support from Segment Anything and other awesome models.
Implementation of popular deep learning networks with TensorRT network definition API
A cross-platform video structuring (video analysis) framework. If you find it helpful, please give it a star :)
Research progress on speech deepfake detection: relevant datasets aggregated from the review literature, and publicly available code
airockchip / yolov5
Forked from ultralytics/yolov5. YOLOv5 in PyTorch > ONNX > CoreML > TFLite
SASV2 baseline, a track on ASVspoof5 phase2 challenge
SoftVC VITS Singing Voice Conversion
VITS model pretrained on VCTK and fine-tuned on a male English voice dataset with an Indian accent from the Arctic dataset.
Port of OpenAI's Whisper model in C/C++
Speaker anonymization pipeline for hiding the identity of the speaker of a recording by changing the voice in it.
Baseline Recipe for VoicePrivacy Challenge 2022: anonymization systems and evaluation software
Resumes generated using GitHub information
Tools for handling speech data in machine learning projects.
Google Research
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
Official implementation of our ASVspoof 2021 paper, "UR Channel-Robust Synthetic Speech Detection System for ASVspoof 2021"
Variational Bayes HMM over x-vectors diarization
SincNet is a neural architecture for efficiently processing raw audio samples.