Stars
OCR software, free and offline. 开源、免费的离线OCR软件。支持截屏/批量导入图片,PDF文档识别,排除水印/页眉页脚,扫描/生成二维码。内置多国语言库。
Auto_Jobs_Applier_AI_Agent aims to easy job hunt process by automating the job application process. Utilizing artificial intelligence, it enables users to apply for multiple jobs in an automated an…
☁️ Build multimodal AI applications with cloud-native stack
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
PDF scientific paper translation with preserved formats - 基于 AI 完整保留排版的 PDF 文档全文双语翻译,支持 Google/DeepL/Ollama/OpenAI 等服务,提供 CLI/GUI/Docker
🚀 一键部署!真正的 AI 聊天机器人!支持ChatGPT、文心一言、讯飞星火、Bing、Bard、ChatGLM、POE,多账号,人设调教,虚拟女仆、图片渲染、语音发送 | 支持 QQ、Telegram、Discord、微信 等平台
Question and Answer based on Anything.
Translate the video from one language to another and add dubbing. 将视频从一种语言翻译为另一种语言,同时支持语音识别转录、语音合成、字幕翻译。
Collection of awesome LLM apps with RAG using OpenAI, Anthropic, Gemini and opensource models.
Netflix-level subtitle cutting, translation, alignment, and even dubbing - one-click fully automated AI video subtitle team | Netflix级字幕切割、翻译、对齐、甚至加上配音,一键全自动视频搬运AI字幕组
a state-of-the-art-level open visual language model | 多模态预训练模型
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.
Tiled Diffusion and VAE optimize, licensed under CC BY-NC-SA 4.0
TEN Agent is a conversational AI powered by the TEN, integrating Gemini 2.0 Live, OpenAI Realtime, RTC, and more. It delivers real-time capabilities to see, hear, and speak, while being fully compa…
o1-engineer is a command-line tool designed to assist developers in managing and interacting with their projects efficiently. Leveraging the power of OpenAI's API, this tool provides functionalitie…
Dead simple FLUX LoRA training UI with LOW VRAM support
本项目为 chatgpt-on-wechat下游分支, 额外对接了LLMOps平台 Dify,支持Dify智能助手模式,调用工具和知识库,支持Dify工作流。
Browser automation system that uses AI-driven planning to navigate web pages and perform goals.
ChatTTS 2000条音色稳定性打分🥇+区分男女年龄👧+在线试听🔈 ChatTTS 2K Speaker Stability Score & Categorized by Gender and Age & Audio Preview
From comfyui workflow to web app, in seconds