[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…

Jupyter Notebook 6,440 420 Updated Jan 12, 2025

OpenGVLab / DCNv4

[CVPR 2024] Deformable Convolution v4

Python 565 30 Updated May 17, 2024

NVlabs / hymba

Python 153 10 Updated Dec 11, 2024

GuangyanS / Sys2-LLaVA

Python 8 Updated Oct 2, 2024

codalab / codalab-competitions

CodaLab Competitions

Python 516 129 Updated Jan 14, 2025

yangchris11 / samurai

Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"

Python 6,376 396 Updated Jan 5, 2025

IDEA-Research / Grounded-SAM-2

Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2

Jupyter Notebook 1,551 145 Updated Dec 21, 2024

IDEA-Research / Grounded-Segment-Anything

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

Jupyter Notebook 15,616 1,440 Updated Sep 5, 2024

dvlab-research / LongLoRA

Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)

Python 2,642 278 Updated Aug 14, 2024

UX-Decoder / Segment-Everything-Everywhere-All-At-Once

[NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once"

Python 4,461 413 Updated Aug 19, 2024

albumentations-team / albumentations

Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125

Python 14,504 1,660 Updated Jan 19, 2025

openai / swarm

Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.

Python 18,096 1,885 Updated Oct 15, 2024

binary-husky / gpt_academic

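A practical interactive interface for large language models such as GPT/GLM, specially optimized for reading, polishing, and writing papers. Modular design; supports custom shortcut buttons & function plugins, project analysis & self-translation for Python, C++, and other projects, PDF/LaTeX paper translation & summarization, parallel queries to multiple LLMs, and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, m…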

Python 67,024 8,229 Updated Jan 21, 2025

aigc-apps / EasyAnimate

📺 An End-to-End Solution for High-Resolution and Long Video Generation Based on Transformer Diffusion

Python 1,693 122 Updated Jan 22, 2025

MzeroMiko / VMamba

VMamba: Visual State Space Models; code is based on mamba

Python 2,364 155 Updated Oct 28, 2024

Cranial-XIX / longhorn

Official PyTorch Implementation of the Longhorn Deep State Space Model

Python 45 3 Updated Dec 4, 2024

Westlake-AI / MogaNet

[ICLR 2024] MogaNet: Efficient Multi-order Gated Aggregation Network

Jupyter Notebook 213 14 Updated Mar 5, 2024

facebookresearch / sam2

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 13,748 1,362 Updated Dec 25, 2024

chenziwenhaoshuai / Vision-Mamba2

Vision Mamba 2: More Efficient Visual Representation Learning with State Space Duality

27 Updated Jun 12, 2024

BradyFU / Awesome-Multimodal-Large-Language-Models

✨✨Latest Advances on Multimodal Large Language Models

13,617 871 Updated Jan 17, 2025