Stars
PyTorch implementation for Score-Based Generative Modeling through Stochastic Differential Equations (ICLR 2021, Oral)
Official code for Score-Based Generative Modeling through Stochastic Differential Equations (ICLR 2021, Oral)
Official repository of SepReformer for speech separation
Deep Attractor Network (DANet) for single-channel speech separation
[IJCAI 2024] EAT: Self-Supervised Pre-Training with Efficient Audio Transformer
PyTorch implementation of VALL-E (zero-shot text-to-speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
Evaluation Metrics Used For The Performance Evaluation of Voice Conversion (VC) Models
A Python package to analyze and compare voices with deep learning
[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
PyTorch implementation of image classification models for MNIST
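Multiple classification methods implemented with PyTorch and scikit-learn, including logistic regression, multilayer perceptron (MLP), support vector machine (SVM), k-nearest neighbors (KNN), CNN, and RNN. Minimal code suited to beginners, with an English experiment report (ACM template)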
🔉 👦 👧 👩 👨 Speaker identification using voice MFCCs and GMM
It uses a GMM to train a speaker identification model. Training and testing were done on a subset (34 speakers) of the VoxForge data corpus.
QuickVC: Any-to-many Voice Conversion Using Inverse Short-time Fourier Transform for Faster Conversion
AI-Generated Presets for Faithful 4K Color Style Transfer in Real Time [CVPR 2023]
Voice Converter Using CycleGAN and Non-Parallel Data
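Hello Algo (《Hello 算法》): a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart. Simplified and Traditional Chinese editions are updated in sync; English version ongoing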
A Transformer-based joint-encoding for Emotion Recognition and Sentiment Analysis
A Non-Autoregressive Transformer based Text-to-Speech, supporting a family of SOTA transformers with supervised and unsupervised duration modelings. This project grows with the research community, …
PyTorch Implementation of Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation
Incorporating the memory mechanism into the transformer and employing a parallel weighting structure to obtain a better utterance-level representation on the speaker verification task
Lane detection bot using Canny edge detection, Hough transform, and PID control in Gazebo with ROS
A C++ implementation of YOLOv5 and DeepSORT
This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE, ERes2Net, and CAM++. More models may be supported in the future. At the …
Python program that hides files in images using least-significant-bit (LSB) steganography.
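
For readers unfamiliar with the least-significant-bit technique named in the last entry, here is a minimal sketch of the general idea. It is not taken from that repository: the function names `embed_lsb` and `extract_lsb`, the 4-byte length header, and the use of a raw byte buffer in place of decoded image pixels are all assumptions made for illustration.

```python
def embed_lsb(pixels: bytes, payload: bytes) -> bytearray:
    """Hide payload in the lowest bit of each carrier byte (hypothetical helper)."""
    # Assumption: a 32-bit big-endian length header tells the extractor where to stop.
    data = len(payload).to_bytes(4, "big") + payload
    # Flatten the data into individual bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("carrier too small for payload")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit of the carrier byte, then set it to the payload bit.
        out[i] = (out[i] & 0xFE) | bit
    return out


def extract_lsb(pixels: bytes) -> bytes:
    """Recover a payload written by embed_lsb (hypothetical helper)."""
    def read_bytes(start_bit: int, count: int) -> bytes:
        value = bytearray()
        for b in range(count):
            byte = 0
            for k in range(8):
                # Rebuild each byte from eight consecutive low bits.
                byte = (byte << 1) | (pixels[start_bit + b * 8 + k] & 1)
            value.append(byte)
        return bytes(value)

    length = int.from_bytes(read_bytes(0, 4), "big")
    return read_bytes(32, length)


# Stand-in for flattened pixel values; a real tool would decode an image first.
carrier = bytes(range(256))
stego = embed_lsb(carrier, b"hi")
recovered = extract_lsb(stego)
```

Each carrier byte changes by at most 1, which is why the altered image looks the same to the eye. A real tool would typically read and write a lossless format such as PNG, since lossy compression would destroy the low bits.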