Stars
《代码随想录》 LeetCode problem-solving guide: a recommended order for 200 classic problems, 600,000+ words of detailed illustrated explanations, video breakdowns of the hard parts, 50+ mind maps, with solutions in C++, Java, Python, Go, JavaScript, and other languages, so algorithm study is no longer confusing! 🔥🔥 Take a look, you'll wish you had found it sooner! 🚀
A PyTorch implementation of EfficientNet
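A minimal usage sketch, assuming this refers to the widely used `efficientnet_pytorch` package; the package name and model variant below are assumptions, not stated above.

```python
# Sketch assuming the `efficientnet_pytorch` package; the model variant is an assumption.
import torch
from efficientnet_pytorch import EfficientNet

# Load an ImageNet-pretrained EfficientNet-B0 and run a single forward pass.
model = EfficientNet.from_pretrained('efficientnet-b0')
model.eval()

dummy = torch.randn(1, 3, 224, 224)   # one RGB image at the B0 input resolution
with torch.no_grad():
    logits = model(dummy)             # shape (1, 1000): ImageNet class scores
```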
Code of paper 'Stochastic Layer-Wise Shuffle: A Good Practice to Improve Vision Mamba Training'
Noise of Web (NoW) is a challenging noisy correspondence learning (NCL) benchmark containing 100K image-text pairs for robust image-text matching/retrieval models.
The official PyTorch implementation of “Mamba-YOLO: SSMs-Based YOLO for Object Detection”
This repository contains the code for the paper 'DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark'.
MambaOut: Do We Really Need Mamba for Vision?
VMamba: Visual State Space Models; the code is based on Mamba
PyTorch implementation of adversarial attacks [torchattacks]
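A short, self-contained sketch of how torchattacks is typically driven; the toy classifier and random batch below are placeholders standing in for a real model and data.

```python
# Sketch of crafting adversarial examples with torchattacks; model and data are toy placeholders.
import torch
import torch.nn as nn
import torchattacks

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).eval()  # stand-in classifier
images = torch.rand(4, 3, 32, 32)        # batch of inputs scaled to [0, 1]
labels = torch.randint(0, 10, (4,))      # ground-truth class indices

# PGD attack: L-inf budget eps, step size alpha, number of iterations steps.
atk = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=10)
adv_images = atk(images, labels)         # adversarially perturbed copies of `images`
```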
The code of the paper "NExT-Chat: An LMM for Chat, Detection and Segmentation".
This repository includes two datasets used in the downstream tasks for evaluating UIBert: App Similar Element Retrieval data and Visual Item Selection (VIS) data. Both datasets are stored as TFRecords.
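A generic sketch of inspecting one of these TFRecord files with TensorFlow; the file name is a placeholder, and the feature keys are whatever the dataset actually defines (not listed above).

```python
import tensorflow as tf

# Placeholder path: point this at one of the downloaded TFRecord shards.
dataset = tf.data.TFRecordDataset(["uibert_downstream_task.tfrecord"])

# Parse one record as a generic tf.train.Example to discover its feature keys,
# since the exact schema is defined by the dataset files themselves.
for raw_record in dataset.take(1):
    example = tf.train.Example()
    example.ParseFromString(raw_record.numpy())
    print(sorted(example.features.feature.keys()))
```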
Pix2Seq codebase: multi-tasks with generative modeling (autoregressive and diffusion)
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
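A condensed sketch of SAM's prompted-inference flow; the checkpoint path, image path, and point prompt below are placeholders.

```python
# Sketch of SAM point-prompted inference; paths and the prompt coordinate are placeholders.
import numpy as np
import cv2
from segment_anything import sam_model_registry, SamPredictor

# Load a ViT-H checkpoint downloaded via the links in the repository.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_checkpoint.pth")
predictor = SamPredictor(sam)

# Embed the image once, then prompt with a single foreground point.
image = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)
masks, scores, logits = predictor.predict(
    point_coords=np.array([[500, 375]]),   # (x, y) pixel coordinate of the prompt
    point_labels=np.array([1]),            # 1 marks a foreground point
    multimask_output=True,                 # return several candidate masks
)
```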
GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
OpenMMLab Pre-training Toolbox and Benchmark
A bilingual (Chinese and English) multimodal conversational language model
[CVPR'23] Universal Instance Perception as Object Discovery and Retrieval
YOLOX is a high-performance anchor-free YOLO, exceeding YOLOv3 through YOLOv5, with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO deployment support. Documentation: https://yolox.readthedocs.io/
mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
[ICCV 2023] Official implementation of the paper: "DIRE for Diffusion-Generated Image Detection"
Grounded Segment Anything: From Objects to Parts
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
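A brief sketch of text-prompted detection in the spirit of the Grounding DINO demo; the config path, checkpoint path, image, and caption are placeholders, and the helper names assume the repo's `groundingdino.util.inference` module.

```python
# Sketch of open-set detection with Grounding DINO; all paths and the caption are placeholders.
from groundingdino.util.inference import load_model, load_image, predict

model = load_model(
    "groundingdino/config/GroundingDINO_SwinT_OGC.py",   # model config shipped with the repo
    "weights/groundingdino_swint_ogc.pth",                # downloaded checkpoint
)
image_source, image = load_image("example.jpg")

# Detect whatever the free-form caption names, keeping confident box/phrase matches.
boxes, logits, phrases = predict(
    model=model,
    image=image,
    caption="a dog . a chair .",
    box_threshold=0.35,
    text_threshold=0.25,
)
```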