MIracleyin
  • Tencent
  • China

Starred repositories

This project aims to collect and collate various datasets for multimodal large model training, including but not limited to pre-training data, instruction fine-tuning data, and In-Context learning …

15 Updated Oct 7, 2024

Deep learning software for colorizing black and white images with a few clicks.

Python 2,702 446 Updated Jul 29, 2022

Style-Text data synthesis tool

Python 33 Updated Dec 9, 2024

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

182 4 Updated Jan 2, 2025

Synthetic data generation pipelines for Pixmo-docs.

Python 16 3 Updated Dec 5, 2024

💥 Blazing fast terminal file manager written in Rust, based on async I/O.

Rust 19,877 444 Updated Jan 3, 2025

Code for the Molmo Vision-Language Model

Python 200 11 Updated Dec 12, 2024

Python tool for converting files and office documents to Markdown.

Python 31,501 1,294 Updated Jan 4, 2025

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Python 4,051 245 Updated Dec 4, 2024

ETL, Analytics, Versioning for Unstructured Data

Python 2,157 97 Updated Jan 4, 2025

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Python 2,258 189 Updated Dec 30, 2024

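A lightweight, flexible, and easy-to-use Python tool for generating and exporting JianYing (剪映) drafts, for building fully automated video editing and remix pipelines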

Python 272 65 Updated Dec 11, 2024

Re-editable LaTeX/Typst graphics for Inkscape

Python 875 41 Updated Jan 3, 2025

kaomoji render for LaTeX

Python 2 Updated Nov 29, 2024

Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.

Jupyter Notebook 2,497 163 Updated Dec 20, 2024

Automatic colorization using deep neural networks. "Colorful Image Colorization." In ECCV, 2016.

Python 3,370 926 Updated Nov 27, 2023

Simple package to extract text with coordinates from programmatic PDFs

C++ 42 10 Updated Dec 16, 2024

Running Docling as an API service

Python 36 10 Updated Dec 19, 2024

Tool to parse wiki tables from the HTML dump of Wikipedia

Python 11 1 Updated Jun 12, 2022

Table Structure Recognition

63 3 Updated Mar 11, 2023

difPy - Python package for finding duplicate and similar images

Python 473 68 Updated Jan 2, 2025

🎁 5,400,000+ Unsplash images made available for research and machine learning

Jupyter Notebook 2,471 122 Updated Feb 9, 2024

qpdf: A content-preserving PDF document transformer

C++ 3,613 286 Updated Dec 14, 2024

A Python library to define and validate data types in Docling.

Python 47 21 Updated Dec 19, 2024

Get your documents ready for gen AI

Python 17,280 901 Updated Jan 3, 2025

Zotero Plugins Collection | Zotero 插件合集 | Awesome Zotero Plugins

TypeScript 453 21 Updated Jan 3, 2025

A plugin template for Zotero.

TypeScript 498 117 Updated Dec 31, 2024

ImageBind: One Embedding Space to Bind Them All

Python 8,452 781 Updated Jul 31, 2024

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python 4,360 226 Updated Dec 31, 2024

A large scale camera-taken table detection and recognition dataset.

Python 117 8 Updated Oct 12, 2023