Distributed task queue with full async support

Python 976 57 Updated Feb 4, 2025

A Comprehensive Toolkit for High-Quality PDF Content Extraction

Python 6,529 438 Updated Jan 3, 2025

Community maintained fork of pdfminer - we fathom PDF

Python 6,181 946 Updated Aug 2, 2024

Open source Python library for converting PDF to DOCX.

Python 2,737 394 Updated Sep 23, 2024

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

Python 6,388 566 Updated Feb 6, 2025

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

HTML 10,006 837 Updated Feb 7, 2025

A curated list of resources for using LLMs to develop more competitive grant applications.

Python 3,476 446 Updated Mar 1, 2024

HyperOS enhancement module - Make HyperOS Great Again!

Java 3,130 187 Updated Feb 7, 2025

Efficient Triton Kernels for LLM Training

Python 4,352 259 Updated Feb 6, 2025

A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON.

Python 25,320 1,916 Updated Feb 7, 2025

Convert PDF to markdown + JSON quickly with high accuracy

Python 20,380 1,217 Updated Feb 6, 2025

🚀🚀 Train a small 26M-parameter GPT entirely from scratch in just 50 minutes!

Python 7,844 804 Updated Dec 13, 2024

Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.

Python 2,420 171 Updated Aug 21, 2024

Official code for our paper, "LoRA-Pro: Are Low-Rank Adapters Properly Optimized?"

Python 98 5 Updated Oct 24, 2024

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.

Python 8,341 836 Updated Feb 5, 2025

PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models (NeurIPS 2024 Spotlight)

Jupyter Notebook 315 14 Updated Feb 1, 2025

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and…

Python 46,193 7,981 Updated Feb 6, 2025

A modular graph-based Retrieval-Augmented Generation (RAG) system

Python 22,153 2,196 Updated Feb 7, 2025

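🚀 KIMI AI long-context LLM reverse-engineered API (specialty: reading and summarizing long texts). Supports high-speed streaming output, agent conversations, web search, the exploration edition, the K1 reasoning model, long-document interpretation, image analysis, and multi-turn dialogue, with zero-configuration deployment, multi-token support, and automatic cleanup of session traces. For testing only; for commercial use, please go to the official open platform.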

TypeScript 4,211 704 Updated Dec 30, 2024

Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/

Rust 21,662 1,488 Updated Feb 7, 2025

A repository of code samples for Vector search capabilities in Azure AI Search.

Jupyter Notebook 787 337 Updated Feb 6, 2025

Netease Youdao's open-source embedding and reranker models for RAG products.

Python 1,588 107 Updated Feb 5, 2025

Compile type annotated Python to fast C extensions

1,788 47 Updated Apr 17, 2023

A fast and powerful RPC framework based on ASGI/WSGI.

Python 199 13 Updated Jul 6, 2024

Windows inside a Docker container.

Shell 32,392 2,258 Updated Feb 6, 2025

Real asynchronous file operations with asyncio support.

Python 528 29 Updated Oct 8, 2024

AirLLM 70B inference with single 4GB GPU

Jupyter Notebook 5,654 450 Updated Nov 24, 2024

A fast asyncio MySQL/MariaDB driver with replication protocol support

Python 275 33 Updated Dec 13, 2024

A fast serialization and validation library, with builtin support for JSON, MessagePack, YAML, and TOML

Python 2,583 86 Updated Dec 27, 2024