【ArXiv】PDF-Wukong: A Large Multimodal Model for Efficient Long PDF Reading with End-to-End Sparse Sampling

115 4 Updated Oct 18, 2024

An open-source implementation for fine-tuning the Qwen2-VL and Qwen2.5-VL series by Alibaba Cloud.

Python 494 55 Updated Mar 21, 2025

Qwen2.5-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.

Jupyter Notebook 8,960 631 Updated Mar 7, 2025

MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

Python 19,030 1,374 Updated Mar 3, 2025

A Survey on Deepfake Generation and Detection

433 20 Updated Feb 21, 2025

Kolors Team

Python 4,286 323 Updated Nov 13, 2024

Dataset and Code for our ACL 2024 paper: "Multimodal Table Understanding". We propose the first large-scale Multimodal IFT and Pre-Train Dataset for table understanding and develop a generalist tab…

Python 190 7 Updated Sep 27, 2024

Lumina-T2X is a unified framework for Text to Any Modality Generation

Python 2,165 91 Updated Feb 16, 2025

【CVPR 2024 Highlight】Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models

Python 1,731 123 Updated Mar 20, 2025

Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

Jupyter Notebook 4,003 334 Updated Jan 13, 2025

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

Python 1,779 87 Updated Oct 31, 2024

Open-Sora: Democratizing Efficient Video Production for All

Python 25,734 2,475 Updated Mar 20, 2025

[CVPR 2024] This is the official source for our paper "SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis"

Python 1,435 168 Updated Aug 28, 2024

Codebase for fine-tuning / evaluating nougat-based image2latex generation models

Python 144 19 Updated Sep 25, 2024

Papers published by members at top computer vision conferences such as ICCV and CVPR, and results from competitions such as ICDAR

1 Updated Jun 21, 2023

This repo is the official implementation of HumanBench (CVPR 2023)

Python 2 Updated Apr 12, 2023

The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Jupyter Notebook 49,386 5,832 Updated Sep 18, 2024

A PyTorch-based OCR toolkit supporting common text detection and recognition algorithms

Python 1,435 309 Updated Sep 2, 2024

A PyTorch re-implementation of Real-time Scene Text Detection with Differentiable Binarization

Python 975 253 Updated Dec 29, 2022

Unofficial PyTorch implementation of 2D Attentional Irregular Scene Text Recognizer

Python 132 34 Updated May 12, 2020

[ECCV 2018] CCPD: a diverse and well-annotated dataset for license plate detection and recognition

Python 2,340 576 Updated Feb 15, 2024

Face recognition based on key facial region extraction (LFW: 99.82%+, CFP_FP: 98.50%+, AgeDB30: 98.25%+)

Python 273 71 Updated Apr 8, 2021

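2019 CCF-BDCI competition: source code from team 天晨破晓, winner of the Best Innovation and Exploration Award and champion of the OCR-based ID card element extraction task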

Python 889 313 Updated Jul 25, 2024

CCF 2019 OCR ID card element recognition: data generator

Python 152 49 Updated Jan 4, 2021

A PyTorch implementation of "Real-time Scene Text Detection with Differentiable Binarization".

Python 2,147 479 Updated Mar 11, 2024

Code for the paper "DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks" (ICCV '19)

Python 1 Updated Nov 10, 2019

💎 1MB lightweight face detection model

Python 7,306 1,537 Updated Dec 29, 2023

Training code for face anti-spoofing with a single RGB frame

Python 131 29 Updated Jul 30, 2019

HED and RCF implementation for edge detection in TensorFlow

Python 4 2 Updated Mar 7, 2019

ChaLearn Face Anti-spoofing Attack Detection Challenge@CVPR2019

Python 412 90 Updated Dec 8, 2022