Stars
The most powerful and modular diffusion model GUI, API and backend with a graph/nodes interface.
Interact with your documents using the power of GPT, 100% privately, no data leaks
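Extract WeChat chat history and export it to HTML, Word, and Excel documents for permanent storage; analyze the chat history to generate an annual chat report; and use the chat data to train a personal AI chat assistant.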
Python tool for converting files and office documents to Markdown.
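Free, open-source, offline OCR software. Supports screenshot and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning/generating QR codes. Includes built-in libraries for multiple languages.
A one-stop, open-source, high-quality data extraction tool for converting PDF to Markdown and JSON.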
We write your reusable computer vision tools. 💜
A modular graph-based Retrieval-Augmented Generation (RAG) system
PDF scientific paper translation with preserved formats: AI-based full-text bilingual translation of PDF documents that fully preserves the layout. Supports Google/DeepL/Ollama/OpenAI and other services, and provides CLI/GUI/Docker/Zotero.
OCR, layout analysis, reading order, table recognition in 90+ languages
SMSBoom - Deprecated: Due to judicial reasons, the repository has been suspended!
FauxPilot - an open-source alternative to GitHub Copilot server
💬 Ready-to-use, flexible RAG Chatbot. A knowledge-base question-answering system built on large language models and RAG.
[CVPR 2024] MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model
YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]
Pure bash script to test and wait on the availability of a TCP host and port
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Enjoy the magic of Diffusion models!
🚀🎬 ShortGPT - Experimental AI framework for YouTube Shorts / TikTok channel automation
A suite of utilities for converting to and working with CSV, the king of tabular file formats.
Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.
AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
Uses WiFi signals 📶 and machine learning to predict where you are
[CVPR 2024] An End-to-End Tile-Based Framework for High-Resolution Monocular Metric Depth Estimation