YoungerGao

2 followers · 1 following

Stars

shilinyan99 / AIDE

「ICLR 2025」 A Sanity Check for AI-generated Image Detection

Python 74 4 Updated Jan 23, 2025

TalkUHulk / Awesome-CLIP

A curated list of research based on CLIP.

175 10 Updated Nov 17, 2024

CorentinJ / Real-Time-Voice-Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Python 53,617 8,897 Updated Aug 14, 2024

PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (a practical, ultra-lightweight OCR system that supports recognition of 80+ languages, provides data annotation and synthesis tools, and supports training and…

Python 46,853 8,034 Updated Feb 26, 2025

zwx8981 / LIQE

[CVPR2023] Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective

Python 214 12 Updated Jan 10, 2025

chencn2020 / PromptIQA

The public code for "PromptIQA: Boosting the Performance and Generalization for No-Reference Image Quality Assessment via Prompts"

Python 21 1 Updated Feb 28, 2025

zihuixue / DynMM

Code for the paper 'Dynamic Multimodal Fusion'

Python 105 13 Updated Apr 6, 2023

Ucas-HaoranWei / GOT-OCR2.0

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python 7,018 616 Updated Feb 10, 2025

nuwandda / yolov7-logo-detection

Logo detection using YOLOv7 with LogoDet-3K and Flickr Logos 27.

Python 42 11 Updated Dec 2, 2024

LaotechLabs / LOGOS

A Brand Independent logo detection model

Jupyter Notebook 30 6 Updated Jan 1, 2024

Arhosseini77 / Brand_Attention

Brand Visibility in Packaging: A Deep Learning Approach for Logo Detection, Saliency-Map Prediction, and Logo Placement Analysis

Python 28 6 Updated Aug 13, 2024

chaofengc / Awesome-Image-Quality-Assessment

A comprehensive collection of IQA papers

TeX 1,113 73 Updated Jan 5, 2025

facebookresearch / ImageBind

ImageBind: One Embedding Space to Bind Them All

Python 8,522 793 Updated Jul 31, 2024

mlfoundations / datacomp

DataComp: In search of the next generation of multimodal datasets

Python 682 56 Updated Jan 2, 2024

rom1504 / img2dataset

Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.

Python 3,919 347 Updated Aug 7, 2024

QwenLM / Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Python 5,546 421 Updated Aug 7, 2024

OpenGVLab / all-seeing

[ICLR 2024 & ECCV 2024] The All-Seeing Projects: Towards Panoptic Visual Recognition & Understanding and General Relation Comprehension of the Open World

Python 475 17 Updated Aug 9, 2024