  • Tencent
  • China

Starred repositories

The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining"

Python 77 6 Updated Jan 5, 2025

This project aims to collect and collate various datasets for multimodal large model training, including but not limited to pre-training data, instruction fine-tuning data, and In-Context learning …

15 Updated Oct 7, 2024

Deep learning software for colorizing black and white images with a few clicks.

Python 2,702 446 Updated Jul 29, 2022

Style-Text data synthesis tool

Python 33 Updated Dec 9, 2024

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

193 5 Updated Jan 2, 2025

Synthetic data generation pipelines for Pixmo-docs.

Python 16 3 Updated Dec 5, 2024

💥 Blazing fast terminal file manager written in Rust, based on async I/O.

Rust 20,064 446 Updated Jan 6, 2025

Code for the Molmo Vision-Language Model

Python 206 11 Updated Dec 12, 2024

Python tool for converting files and office documents to Markdown.

Python 32,260 1,343 Updated Jan 4, 2025

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Python 4,070 248 Updated Dec 4, 2024

ETL, Analytics, Versioning for Unstructured Data

Python 2,165 97 Updated Jan 6, 2025

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Python 2,266 189 Updated Dec 30, 2024

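A lightweight, flexible, easy-to-use Python tool for generating and exporting JianYing (剪映) drafts, for building fully automated video editing and remix pipelines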

Python 275 66 Updated Jan 6, 2025

Re-editable LaTeX/Typst graphics for Inkscape

Python 875 41 Updated Jan 3, 2025

Kaomoji renderer for LaTeX

Python 2 Updated Nov 29, 2024

Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.

Jupyter Notebook 2,497 163 Updated Dec 20, 2024

Automatic colorization using deep neural networks. "Colorful Image Colorization." In ECCV, 2016.

Python 3,371 926 Updated Nov 27, 2023

Simple package to extract text with coordinates from programmatic PDFs

C++ 43 10 Updated Dec 16, 2024

Running Docling as an API service

Python 37 10 Updated Dec 19, 2024

Tool to parse wiki tables from the HTML dump of Wikipedia

Python 11 1 Updated Jun 12, 2022

Table Structure Recognition

63 3 Updated Mar 11, 2023

difPy - Python package for finding duplicate and similar images

Python 473 68 Updated Jan 2, 2025

🎁 5,400,000+ Unsplash images made available for research and machine learning

Jupyter Notebook 2,473 122 Updated Feb 9, 2024

qpdf: A content-preserving PDF document transformer

C++ 3,618 286 Updated Jan 5, 2025

A Python library to define and validate data types in Docling.

Python 49 21 Updated Dec 19, 2024

Get your documents ready for gen AI

Python 17,417 906 Updated Jan 3, 2025

Zotero Plugins Collection | Awesome Zotero Plugins

TypeScript 454 21 Updated Jan 5, 2025

A plugin template for Zotero.

TypeScript 500 117 Updated Dec 31, 2024

ImageBind One Embedding Space to Bind Them All

Python 8,454 782 Updated Jul 31, 2024

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python 4,366 226 Updated Dec 31, 2024