UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition

Python 249 23 Updated Dec 26, 2024

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 2,362 134 Updated Jan 10, 2025

📚 Books of various kinds

645 245 Updated Nov 19, 2021

TDF-ICDAR 2019 Dataset for Typeset Math Formula Detection

Python 67 18 Updated Feb 9, 2020

DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception

Python 750 56 Updated Jan 6, 2025

FormulaNet is a new large-scale Mathematical Formula Detection dataset.

Python 16 10 Updated Nov 21, 2022
Jupyter Notebook 492 23 Updated Aug 23, 2024

DocBank: A Benchmark Dataset for Document Layout Analysis

Python 591 72 Updated Aug 12, 2024

WikiTableSet: The largest publicly available image-based table recognition dataset in three languages, built from Wikipedia

Python 27 2 Updated Mar 31, 2023

Overview of datasets for ML in chemistry

278 28 Updated Jul 24, 2024

Table Structure Recognition

64 3 Updated Mar 11, 2023

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python 6,541 573 Updated Jan 11, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 7,273 696 Updated Jan 13, 2025

Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.

Python 2,292 159 Updated Aug 21, 2024

To try TextIn document parsing, visit https://cc.co/16YSIy

Python 69 7 Updated Nov 11, 2024

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 5,150 457 Updated Jan 13, 2025

Graphic notes on Gilbert Strang's "Linear Algebra for Everyone"; Chinese edition of The Art of Linear Algebra. PRs welcome.

PostScript 4,616 457 Updated Feb 4, 2024

XFUND: A Multilingual Form Understanding Benchmark

192 19 Updated Jul 15, 2022

Formula recognition based on LaTeX-OCR and ONNXRuntime.

Python 319 29 Updated Nov 3, 2024

Chinese Mathematical Formula Detection (MFD) Dataset: a mathematical formula detection dataset for Chinese documents

Python 30 2 Updated Dec 21, 2022

CnSTD: A Python 3 package based on PyTorch/MXNet for Chinese/English Scene Text Detection, Mathematical Formula Detection (MFD), and Layout Analysis

Python 704 108 Updated Jan 8, 2025

CDLA: A Chinese document layout analysis (CDLA) dataset

Python 254 31 Updated Sep 13, 2021

You can easily calculate FVD, PSNR, SSIM, LPIPS for evaluating the quality of generated or predicted videos.

Python 272 11 Updated Jan 6, 2025

Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷 Providing higher-quality, richer, and more "digestible" data for large models!

Python 3,365 195 Updated Jan 13, 2025

A collection of awesome video generation studies.

TeX 422 15 Updated Jan 1, 2025

ReadingBank: A Benchmark Dataset for Reading Order Detection

96 3 Updated Aug 26, 2024

Open-Sora: Democratizing Efficient Video Production for All

Python 23,072 2,272 Updated Dec 27, 2024

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.

Python 11,858 1,042 Updated Dec 31, 2024

MiniRBT (a series of small Chinese pre-trained models)

Python 258 17 Updated Apr 5, 2023

The GitButler version control client, backed by Git, powered by Tauri/Rust/Svelte

Rust 13,872 550 Updated Jan 13, 2025