MIracleyin

Yin Zhang MIracleyin

Scientific NLP, Science of Science, Recommendation Systems, OS, Rust

42 followers · 157 following

Tencent
China

Starred repositories

DAMO-NLP-SG / multimodal_textbook

The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining"

Python 77 6 Updated Jan 5, 2025

jinbo0906 / Awesome-MLLM-Datasets

This project aims to collect and collate various datasets for multimodal large model training, including but not limited to pre-training data, instruction fine-tuning data, and In-Context learning …

15 Updated Oct 7, 2024

junyanz / interactive-deep-colorization

Deep learning software for colorizing black and white images with a few clicks.

Python 2,702 446 Updated Jul 29, 2022

PFCCLab / StyleText

Style-Text data synthesis tool

Python 33 Updated Dec 9, 2024

LMM101 / Awesome-Multimodal-Next-Token-Prediction

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

193 5 Updated Jan 2, 2025

allenai / pixmo-docs

Synthetic data generation pipelines for Pixmo-docs.

Python 16 3 Updated Dec 5, 2024

sxyazi / yazi

💥 Blazing fast terminal file manager written in Rust, based on async I/O.

Rust 20,064 446 Updated Jan 6, 2025

allenai / molmo

Code for the Molmo Vision-Language Model

Python 206 11 Updated Dec 12, 2024

microsoft / markitdown

Python tool for converting files and office documents to Markdown.

Python 32,260 1,343 Updated Jan 4, 2025

QwenLM / Qwen2-VL

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Python 4,070 248 Updated Dec 4, 2024

iterative / datachain

ETL, Analytics, Versioning for Unstructured Data

Python 2,165 97 Updated Jan 6, 2025

NVlabs / VILA

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Python 2,266 189 Updated Dec 30, 2024

GuanYixuan / pyJianYingDraft

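A lightweight, flexible, easy-to-use Python tool for generating and exporting JianYing (剪映) drafts, for building fully automated video editing and remixing pipelines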

Python 275 66 Updated Jan 6, 2025

textext / textext

Re-editable LaTeX/Typst graphics for Inkscape

Python 875 41 Updated Jan 3, 2025

clerkma / kaomoji

kaomoji render for LaTeX

Python 2 Updated Nov 29, 2024

google-research / big_vision

Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.

Jupyter Notebook 2,497 163 Updated Dec 20, 2024

richzhang / colorization

Automatic colorization using deep neural networks. "Colorful Image Colorization." In ECCV, 2016.

Python 3,371 926 Updated Nov 27, 2023

DS4SD / docling-parse

Simple package to extract text with coordinates from programmatic PDFs

C++ 43 10 Updated Dec 16, 2024

DS4SD / docling-serve

Running Docling as an API service

Python 37 10 Updated Dec 19, 2024

phucty / wtabhtml

Tool to parse wiki tables from the HTML dump of Wikipedia

Python 11 1 Updated Jun 12, 2022

FutureRising007 / Table_Structure_Recognition

Table Structure Recognition

63 3 Updated Mar 11, 2023

elisemercury / Duplicate-Image-Finder

difPy - Python package for finding duplicate and similar images

Python 473 68 Updated Jan 2, 2025

unsplash / datasets

🎁 5,400,000+ Unsplash images made available for research and machine learning

Jupyter Notebook 2,473 122 Updated Feb 9, 2024

qpdf / qpdf

qpdf: A content-preserving PDF document transformer

C++ 3,618 286 Updated Jan 5, 2025

DS4SD / docling-core

A python library to define and validate data types in Docling.

Python 49 21 Updated Dec 19, 2024

DS4SD / docling

Get your documents ready for gen AI

Python 17,417 906 Updated Jan 3, 2025

zotero-chinese / zotero-plugins

Zotero Plugins Collection | Awesome Zotero Plugins

TypeScript 454 21 Updated Jan 5, 2025

windingwind / zotero-plugin-template

A plugin template for Zotero.

TypeScript 500 117 Updated Dec 31, 2024

facebookresearch / ImageBind

ImageBind: One Embedding Space to Bind Them All

Python 8,454 782 Updated Jul 31, 2024

facebookresearch / lingua

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python 4,366 226 Updated Dec 31, 2024

Starred topics

citation-recommendation

causal-inference

LaTeX

Lists (17)

Useful tools

Powerful library

Paper implementation

Data analysis

Course

Paper Idea

rice

explainability

language

interesting

LLM-eval

LLM-tuning

LLM-tool

Dataset

MLLM

latex

template
