Stars
ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations
Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3
Non-Linear Least Squares Minimization, with flexible Parameter settings, based on scipy.optimize, and with many additional classes and methods for curve fitting.
PyTorch Implementation of TCSinger (EMNLP 2024): Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control
This is an official PyTorch implementation of ASDA (accepted by ACMMM 2024).
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Do LeetCode exercises in the IDE; supports leetcode.com and leetcode-cn.com and covers the basic needs of doing exercises. Theoretically supports: IntelliJ IDEA, PhpStorm, WebStorm, PyCharm, RubyMine, AppCode, CL…
OpenMusic: SOTA Text-to-music (TTM) Generation
An Open-Sourced LLM-empowered Foundation TTS System
The open source code for SimpleSpeech series
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
LlamaVoice is a llama-based large voice generation model, providing inference and training ability.
C++ Insights - See your source code with the eyes of a compiler
Official mirror of Rubber Band Library, an audio time-stretching and pitch-shifting library.
Vehicle State Estimation using Error-State Extended Kalman Filter
Multichannel State Space Frequency-Domain Adaptive Filtering (MCSSFDAF)
Sourcetrail - free and open-source interactive source explorer
Unofficial PyTorch reproduction of the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (arXiv:2401.01498)
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…
Collection of EM algorithms for blind source separation of audio signals
🎙️📝 A powerful Flask-based web application that leverages the latest Hugging Face ASR models to provide real-time speech-to-text (STT) transcripts with an intuitive user interface for easy correcti…