Stars
Use PEFT or Full-parameter to finetune 400+ LLMs (Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, ...) or 150+ MLLMs (Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, Inter…
An efficient, flexible and full-featured toolkit for fine-tuning LLMs (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
A comprehensive paper list on Vision Transformer/Attention, including papers, code, and related websites
Official code for ICCV 2023 Paper: AlignDet: Aligning Pre-training and Fine-tuning in Object Detection.
This is the official code for MobileSAM project that makes SAM lightweight for mobile applications and beyond!
[A toolbox for fun.] Transform Image into Unique Paragraph with ChatGPT, BLIP2, OFA, GRIT, Segment Anything, ControlNet.
[NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once"
Relate Anything Model is capable of taking an image as input and utilizing SAM to identify the corresponding mask within the image.
An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) fo…
Efficient computing methods developed by Huawei Noah's Ark Lab
This is the official repository for our recent work: PIDNet
Caption-Anything is a versatile tool combining image segmentation, visual captioning, and ChatGPT, generating tailored captions with diverse controls for user preferences. https://huggingface.co/sp…
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
[ICCV 2023] Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
[CVPR 2023 Best Paper Award] Planning-oriented Autonomous Driving
A playbook for systematically maximizing the performance of deep learning models.
Official GitHub repository for Argoverse dataset
(TPAMI2022) The ImageNet-S benchmark/method for large-scale unsupervised/semi-supervised semantic segmentation.
PyTorch implementation of MoCo: https://arxiv.org/abs/1911.05722
PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO
Official implementation of Score-CAM in PyTorch
Python code for the fast bilateral solver
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
A series of works (ECCV2020, CVPR2021, CVPR2021, ECCV2022) on Compositional Learning for Human-Object Interaction Exploration