「ICLR 2025」 A Sanity Check for AI-generated Image Detection

Python 74 4 Updated Jan 23, 2025

A curated list of research based on CLIP.

175 10 Updated Nov 17, 2024

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Python 53,617 8,897 Updated Aug 14, 2024

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and…

Python 46,853 8,034 Updated Feb 26, 2025

[CVPR2023] Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective

Python 214 12 Updated Jan 10, 2025

The public code for "PromptIQA: Boosting the Performance and Generalization for No-Reference Image Quality Assessment via Prompts"

Python 21 1 Updated Feb 28, 2025

Code for the paper 'Dynamic Multimodal Fusion'

Python 105 13 Updated Apr 6, 2023

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python 7,018 616 Updated Feb 10, 2025

Logo detection using YOLOv7 with LogoDet-3K and Flickr Logos 27.

Python 42 11 Updated Dec 2, 2024

A Brand Independent logo detection model

Jupyter Notebook 30 6 Updated Jan 1, 2024

Brand Visibility in Packaging: A Deep Learning Approach for Logo Detection, Saliency-Map Prediction, and Logo Placement Analysis

Python 28 6 Updated Aug 13, 2024

A comprehensive collection of IQA papers

TeX 1,113 73 Updated Jan 5, 2025

ImageBind: One Embedding Space to Bind Them All

Python 8,522 793 Updated Jul 31, 2024

DataComp: In search of the next generation of multimodal datasets

Python 682 56 Updated Jan 2, 2024

Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.

Python 3,919 347 Updated Aug 7, 2024

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Python 5,546 421 Updated Aug 7, 2024

[ICLR 2024 & ECCV 2024] The All-Seeing Projects: Towards Panoptic Visual Recognition & Understanding and General Relation Comprehension of the Open World

Python 475 17 Updated Aug 9, 2024

[COLM 2024] LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition

Python 615 37 Updated Jul 22, 2024

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 20,824 2,597 Updated Feb 6, 2025

The unofficial python package that returns response of Google Bard through cookie value.

Python 5,288 523 Updated Apr 24, 2024

🦩 Visual Instruction Tuning with Polite Flamingo - training multi-modal LLMs to be both clever and polite! (AAAI-24 Oral)

Python 63 3 Updated Dec 9, 2023

LLM training and hyperparameter tuning from zero

31 3 Updated Jul 20, 2023

Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models

Python 5,123 489 Updated Feb 26, 2025

[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning

Python 271 13 Updated Mar 13, 2024

✨✨Latest Advances on Multimodal Large Language Models

14,056 898 Updated Mar 1, 2025

A large-scale 7B pretraining language model developed by BaiChuan-Inc.

Python 5,693 507 Updated Jul 18, 2024

食铁兽: code for the introductory ffmpeg 4/5/6 tutorial series

C++ 304 112 Updated Nov 30, 2023

English multi-label text classification with BERT in PyTorch

Python 33 11 Updated Mar 12, 2022

Chinese text classification using BERT and ERNIE

Python 1 Updated Mar 25, 2022

VRT: A Video Restoration Transformer (official repository)

Python 1,415 134 Updated Jun 18, 2023