Stars
All Algorithms implemented in Python
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
Robust Speech Recognition via Large-Scale Weak Supervision
The most powerful and modular diffusion model GUI, API and backend with a graph/nodes interface.
The world's simplest facial recognition API for Python and the command line
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra-lightweight OCR system, supports recognition of 80+ languages, provides data annotation and synthesis tools, supports training and…
Real-time face swap and one-click video deepfake with only a single image
One minute of voice data can also be used to train a good TTS model! (few-shot voice cloning)
High-Resolution Image Synthesis with Latent Diffusion Models
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
A generative speech model for daily dialogue.
A modular graph-based Retrieval-Augmented Generation (RAG) system
DeepFaceLab is the leading software for creating deepfakes.
ChatGLM3 series: Open Bilingual Chat LLMs | Open-source bilingual dialogue language models
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in PyTorch
PyTorch package for the discrete VAE used for DALL·E.
High-Resolution 3D Human Digitization from A Single Image.
The code for our newly accepted paper in Pattern Recognition 2020: "U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection."
Easy-to-use image segmentation library with awesome pre-trained model zoo, supporting a wide range of practical tasks in Semantic Segmentation, Interactive Segmentation, Panoptic Segmentation, Image …
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
VITS2 backbone with multilingual BERT
PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models
Stable Diffusion built-in to Blender