Starred repositories
The official Python SDK for Model Context Protocol servers and clients
Visualizer for neural network, deep learning and machine learning models
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
DeepStream Libraries offer CVCUDA, NvImageCodec, and PyNvVideoCodec modules as Python APIs for seamless integration into custom frameworks.
Python bindings for FFmpeg - with complex filtering support
OpenVINO™ is an open source toolkit for optimizing and deploying AI inference
This repository contains demos I made with the Transformers library by Hugging Face.
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
[ECCV 2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
An efficient video loader for deep learning with smart shuffling that's super easy to digest
A cross-platform, high-performance, FFmpeg-based real-time video frame decoder in pure Python 🎞️⚡
[ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
Scenic: A JAX Library for Computer Vision Research and Beyond
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
[ECCV 2024] Video Foundation Models & Data for Multimodal Understanding
[CVPR 2025] DEIM: DETR with Improved Matching for Fast Convergence
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
[CVPR 2025] Official implementation of "AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models"
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
Produce redistributable builds of Python
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
[NeurIPS 2023] This repo contains the code for our paper Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP
GoogleTest - Google Testing and Mocking Framework
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.