DAVID-Hown

💭

I may be slow to respond.

dawei-hao DAVID-Hown

4 followers · 6 following

Starred repositories

ndrwmlnk / Awesome-Video-Diffusion-Models

38 Updated Jul 5, 2024

diff-usion / Awesome-Diffusion-Models

A collection of resources and papers on Diffusion Models

HTML 11,338 955 Updated Aug 1, 2024

WisconsinAIVision / visii

👀 Visual Instruction Inversion: Image Editing via Visual Prompting (NeurIPS 2023)

Python 88 2 Updated Dec 19, 2023

PRIV-Creation / Awesome-Controllable-T2I-Diffusion-Models

A collection of resources on controllable generation with text-to-image diffusion models.

970 27 Updated Dec 31, 2024

Yutong-Zhou-cv / Awesome-Text-to-Image

(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.

2,247 196 Updated Jan 6, 2025

cubiq / ComfyUI_IPAdapter_plus

Python 4,492 340 Updated Sep 13, 2024

s9roll7 / animatediff-cli-prompt-travel

Forked from neggles/animatediff-cli

animatediff prompt travel

Python 1,194 104 Updated Jan 13, 2024

tencent-ailab / IP-Adapter

The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with an image prompt.

Jupyter Notebook 5,552 351 Updated Jun 28, 2024

fofr / cog-consistent-character

Create images of a given character in different poses

Python 626 65 Updated Jun 5, 2024

lllyasviel / Fooocus

Focus on prompting and generating

Python 42,685 6,248 Updated Jan 14, 2025

dilithjay / Sinhala-ParSeq

Forked from baudm/parseq

Scene Text Recognition with Permuted Autoregressive Sequence Models (ECCV 2022)

Python 2 Updated Jun 23, 2023

levihsu / OOTDiffusion

Official implementation of OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on

Python 5,982 852 Updated May 13, 2024

facebookresearch / mae

PyTorch implementation of MAE https://arxiv.org/abs/2111.06377

Python 7,500 1,233 Updated Jul 23, 2024

thunlp / LLaVA-UHD

LLaVA-UHD v2: an MLLM Integrating High-Resolution Feature Pyramid via Hierarchical Window Transformer

Python 350 15 Updated Jan 14, 2025

hithqd / DynamicControl

33 Updated Jan 10, 2025

Ucas-HaoranWei / GOT-OCR2.0

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python 6,589 575 Updated Jan 11, 2025

SCUT-DLVCLab / GPT-4V_OCR

Evaluation of the Optical Character Recognition (OCR) capabilities of GPT-4V(ision)

Python 121 4 Updated Nov 13, 2023

OpenGVLab / InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.

Python 6,823 524 Updated Dec 25, 2024

amazon-science / auto-cot

Official implementation for "Automatic Chain of Thought Prompting in Large Language Models" (stay tuned & more will be updated)

Jupyter Notebook 1,666 150 Updated Mar 13, 2024

CASIA-IVA-Lab / AnomalyGPT

[AAAI 2024 Oral] AnomalyGPT: Detecting Industrial Anomalies Using Large Vision-Language Models

Python 855 107 Updated Dec 20, 2023

jam-cc / MMAD

The Codes and Data of The First-Ever Comprehensive Benchmark for Multimodal Large Language Models in Industrial Anomaly Detection

Python 49 3 Updated Jan 8, 2025

Q-Future / Q-Align

③[ICML2024] [IQA, IAA, VQA] All-in-one Foundation Model for visual scoring. Can be efficiently fine-tuned on downstream datasets.

Python 336 24 Updated Aug 12, 2024

Stability-AI / generative-models

Generative Models by Stability AI

Python 25,107 2,783 Updated Sep 4, 2024

Vision-CAIR / MiniGPT-4

Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)

Python 25,534 2,926 Updated Sep 2, 2024

HVision-NKU / StoryDiffusion

Accepted as [NeurIPS 2024] Spotlight Presentation Paper

Jupyter Notebook 6,121 616 Updated Sep 26, 2024

QwenLM / Qwen2-VL

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Python 4,237 259 Updated Jan 11, 2025

lm-sys / FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Python 37,514 4,592 Updated Jan 18, 2025

QwenLM / Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Python 5,323 404 Updated Aug 7, 2024

Alpha-VLLM / LLaMA2-Accessory

An Open-source Toolkit for LLM Development

Python 2,747 176 Updated Jan 13, 2025

ChenRocks / UNITER

Research code for ECCV 2020 paper "UNITER: UNiversal Image-TExt Representation Learning"

Python 787 109 Updated Jun 30, 2021
