Stars
The Fully Customizable Desktop Environment for Windows 10/11.
No fortress, purely open ground. OpenManus is Coming.
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and…
Tesseract Open Source OCR Engine (main repository)
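This project is for the SpaceKat DIY 3D mouse; it converts 3D motions into keyboard and mouse key combinations, extending how the device can be used.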
Toolkit for linearizing PDFs for LLM datasets/training
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, and other large language models.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
GitHub Copilot plugin for Typora on Windows, macOS, and Linux, provided through Copilot.vim.
Control your ESP32 projects with a PS3 controller!
Wan: Open and Advanced Large-Scale Video Generative Models
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
A GPU-accelerated cross-platform terminal emulator and multiplexer written by @wez and implemented in Rust
One minute of voice data can also be used to train a good TTS model! (few-shot voice cloning)
A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time!
Janus-Series: Unified Multimodal Understanding and Generation Models
A GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.
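Legado 3.0 (阅读3.0) Book Reader with powerful controls & full functions❤️ Legado is a tool for reading web content from customizable sources, giving web-fiction enthusiasts a convenient, fast, and comfortable reading experience.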
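The most comprehensive database of Chinese poetry 🧶 Covers nearly 14,000 poets of the Tang and Song dynasties, with close to 55,000 Tang poems and 260,000 Song poems, plus 1,564 ci poets of the Song period and 21,050 ci.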
An extremely fast Python package and project manager, written in Rust.
DSPy: The framework for programming—not prompting—language models