- Xidian University
- Xi'an
- https://wanghao15536870732.github.io
Starred repositories
[NeurIPS 2024] WATT: Weight Average Test-Time Adaptation of CLIP
[ICLRW 2024] Efficient Remote Sensing with Harmonized Transfer Learning and Modality Alignment
EUFCC-CIR: A Composed Image Retrieval Dataset for GLAM Collections
[AAAI-2025] The official code of Densely Connected Parameter-Efficient Tuning for Referring Image Segmentation
When do we not need larger vision models?
Provides pre-built flash-attention package wheels using GitHub Actions
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
[ICLR 2024] Official repository for "Vision-by-Language for Training-Free Compositional Image Retrieval"
[ICCV 2023] - Composed Image Retrieval on Common Objects in context (CIRCO) dataset
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Rethinking Step-by-step Visual Reasoning in LLMs
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
Visual Delta Generator with Large Multi-modal Model for Semi-supervised Composed Image Retrieval - CVPR2024
The official implementation of Natural Language Fine-Tuning
Unofficial implementation for [ECCV'22] "Exploring Plain Vision Transformer Backbones for Object Detection"
Project for "HyperSeg: Towards Universal Visual Segmentation with Large Language Model".
ACM Multimedia 2023 (Oral) - RTQ: Rethinking Video-language Understanding Based on Image-text Model
[ECCV 2024] The official implementation of "AdaCLIP: Adapting CLIP with Hybrid Learnable Prompts for Zero-Shot Anomaly Detection"
A curated list of awesome prompt/adapter learning methods for vision-language models like CLIP.
[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…
A Clip-Hitchiker's Guide to Long Video Retrieval [Arxiv 2022]
The official repository for ICLR2024 paper "FROSTER: Frozen CLIP is a Strong Teacher for Open-Vocabulary Action Recognition"
Collection of Composed Image Retrieval (CIR) papers.
MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge (ICCV 2023)
[ECCV 2024] EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval
Evaluation code and datasets for the ACL 2024 paper, VISTA: Visualized Text Embedding for Universal Multi-Modal Retrieval. The original code and model can be accessed at FlagEmbedding.
Code for the paper "Finetuning CLIP to Reason about Pairwise Differences"