Stars
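🚀 Next Generation AI One-Stop Internationalization Solution. 🚀 A next-generation, one-stop AI solution for business (B-end) and consumer (C-end) use, supporting OpenAI, Midjourney, Claude, iFlytek Spark, Stable Diffusion, DALL·E, ChatGLM, Tongyi Qianwen, Tencent Hunyuan, 360 Zhinao, Baichuan AI, Volcano Ark, New Bing, Gemini, Moonshot …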
The official implementation of GTCRN, an ultra-lite speech enhancement model.
Efficient Multimodal Large Language Models: A Survey
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
A generative speech model for daily dialogue.
Control adaptive filters with neural networks.
An evolving, large-scale and multi-domain ASR corpus for low-resource languages with automated crawling, transcription and refinement
The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, and more.
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
A Collection of Useful C++ Classes for Digital Signal Processing
Speech commands recognition with PyTorch | Kaggle 10th place solution in TensorFlow Speech Recognition Challenge
Just 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Audio Codec Speech processing Universal PERformance Benchmark
The Robot Operating System is a meta operating system for robots.
Acoular - Acoustic testing and source mapping software
End-to-End Automatic Speech Recognition on PyTorch
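Llama Chinese community. Llama3 online demo and fine-tuned models are now available. The latest Llama3 learning resources are collected in real time, and all code has been updated for Llama3. Building the best Chinese Llama large model, fully open source and available for commercial use.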
Get up and running with Llama 3.3, Phi 4, Gemma 2, and other large language models.
The repo provides information about KeSpeech dataset.
An introductory LLM tutorial for developers: the Chinese edition of Andrew Ng's large language model course series.
Keyword spotting and speech wake-up in PyTorch: DNN, CNN, TDNN, DFSMN, LSTM
Learning Efficient Representations for Keyword Spotting with Triplet Loss