Stars
Robust Speech Recognition via Large-Scale Weak Supervision
The Python micro framework for building web applications.
Clone a voice in 5 seconds to generate arbitrary speech in real-time
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
ChatGLM-6B: An Open Bilingual Dialogue Language Model
Instant voice cloning by MIT and MyShell.
A complete and graceful API for WeChat. A WeChat personal account interface, WeChat bot, and command-line WeChat; customize a personal account bot in thirty lines.
Easily train a good VC model with voice data <= 10 mins!
Best Practices on Recommendation Systems
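Use ChatGPT to summarize arXiv papers. Speeds up the whole research workflow, using ChatGPT for full-paper summarization, professional translation, polishing, peer review, and review responses.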
GUI for a Vocal Remover that uses Deep Neural Networks.
newspaper3k is a news, full-text, and article metadata extraction library in Python 3. Advanced docs:
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
so-vits-svc fork with realtime support, improved interface and more features.
Companion code to my O'Reilly book "Flask Web Development", second edition.
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/
SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
Official implementations for various pre-training models of ERNIE-family, covering topics of Language Understanding & Generation, Multimodal Understanding & Generation, and beyond.
OpenMMLab Pose Estimation Toolbox and Benchmark.
A PyTorch implementation of NeRF (Neural Radiance Fields) that reproduces the results.
Muzic: Music Understanding and Generation with Artificial Intelligence
Chinese and English multimodal conversational language model
Efficient 3D human pose estimation in video using 2D keypoint trajectories
Open source Structure-from-Motion pipeline
Aggregates RSS and web content (Calibre recipe), sends to Kindle, and includes an e-ink optimized online reader.