Stars
A feature-rich command-line audio/video downloader
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
A Gradio web UI for Large Language Models with support for multiple inference backends.
A high-throughput and memory-efficient inference and serving engine for LLMs
Easily train a good VC model with voice data <= 10 mins!
Universal LLM Deployment Engine with ML Compilation
The best free and open-source automated time tracker. Cross-platform, extensible, privacy-focused.
[CVPR 2024] MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model
Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / Cloudflare IUAM)
Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.
Proxy server to bypass Cloudflare protection
Streamlined interface for generating images with AI in Krita. Inpaint and outpaint with optional text prompt, no tweaking required.
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
🪩 Create Disco Diffusion artworks in one line
A scriptable music downloader for Qobuz, Tidal, SoundCloud, and Deezer
A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
Convolutional Neural Networks to predict the aesthetic and technical quality of images.
Automated censorship evasion for the client-side and server-side
A simple, high-quality voice conversion tool focused on ease of use and performance.
[ICCV 2023] Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior
Transcription, forced alignment, and audio indexing with OpenAI's Whisper
Versatile audio super resolution (any -> 48kHz) with AudioSR.
Audio super resolution using neural networks
Upload and download files from Telegram up to 4 GiB using your account
*CREPE+HYBRID TRAINING* A very experimental fork of the Retrieval-based-Voice-Conversion-WebUI repo that incorporates a variety of other f0 methods, along with a hybrid f0 nanmedian method.
The code for the bark-voicecloning model. Training and inference.