- Northwestern Polytechnical University
- Suzhou
Stars
Awesome Neural Codec Models, Text-to-Speech Synthesizers & Speech Language Models
TTSAudioNormalizer is a specialized tool for TTS data production, featuring descriptive statistical analysis of audio loudness and loudness normalization operations.
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions
Collection of awesome LLM apps with RAG using OpenAI, Anthropic, Gemini and open-source models.
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
Implementation of the CVPR 2019 Paper - Speech2Face: Learning the Face Behind a Voice by MIT CSAIL
Audio-Visual Speech Separation with Cross-Modal Consistency
A family of state-of-the-art Transformer-based audio codecs for low-bitrate high-quality audio coding.
Awesome speech/audio LLMs, representation learning, and codec models
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Manipulate audio with a simple and easy high level interface
Code for the paper Hybrid Spectrogram and Waveform Source Separation
Noise suppression using deep filtering
GUI for a Vocal Remover that uses Deep Neural Networks.
✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM
Accessible large language models via k-bit quantization for PyTorch.
Paper, Code and Resources for Speech Language Model and End2End Speech Dialogue System.
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
The official repo of NBC & SpatialNet for multichannel speech separation, denoising, and dereverberation
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
Continuation of Clash Verge - A Clash Meta GUI based on Tauri (Windows, macOS, Linux)