Stars
Self-Supervised Speech Pre-training and Representation Learning Toolkit
GolfDB is a video database for Golf Swing Sequencing, which involves detecting 8 golf swing events in trimmed golf swing videos. This repo demos the baseline model, SwingNet.
Golf swing detection/extraction with computer vision and machine learning techniques, using Roboflow's object detection model and RNNs in PyTorch.
Official Implementations for Paper - MagicQuill: An Intelligent Interactive Image Editing System
AI Agents for Semi-Autonomous Public Goods Production
A small project to track a putt and calculate its speed.
A simple way to keep track of an Exponential Moving Average (EMA) version of your PyTorch model
Instant voice cloning by MIT and MyShell. Audio foundation model.
[CVPR-2023] The official dataset of Advancing Visual Grounding with Scene Knowledge: Benchmark and Method.
ControlNet++: All-in-one ControlNet for image generation and editing!
Recent LLM-based CV and related works. Welcome to comment/contribute!
A High-Quality Real Time Upscaler for Anime Video
Implementation of Key-Locked Rank One Editing, from Nvidia AI
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
Implementation of Make-A-Video, new SOTA text-to-video generator from Meta AI, in PyTorch
Image composition toolbox: everything you want to know about image composition or object insertion
Adala: Autonomous DAta (Labeling) Agent framework
A 99% automated pipeline to construct training sets from anime and more for text-to-image model training
A ranked list of awesome Python open-source libraries and tools. Updated weekly.
A ranked list of awesome Python developer tools and libraries. Updated weekly.
Improved AnimateDiff for ComfyUI and Advanced Sampling Support
Edit anything in images powered by segment-anything, ControlNet, StableDiffusion, etc. (ACM MM)