Stars
A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON formats.
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Real-time interactive streaming digital human
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
Convert PDF to markdown + JSON quickly with high accuracy
OCR, layout analysis, reading order, table recognition in 90+ languages
Interactive app(s) for showing fireworks using Flutter's canvas.
Flutter version of AlphaPlayer, ByteDance's official native component
🦜🔗 Build context-aware reasoning applications
A simple local web interface that uses ChatTTS to synthesize text into speech, and also supports exposing an API for external use.
Faster Whisper transcription with CTranslate2
A generative speech model for daily dialogue.
FastGPT is a knowledge-based platform built on LLMs that offers a comprehensive suite of out-of-the-box capabilities such as data processing, RAG retrieval, and visual AI workflow orchestration, le…
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
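The most complete database of classical Chinese poetry: nearly 14,000 poets from the Tang and Song dynasties, close to 55,000 Tang poems plus 260,000 Song poems, and 21,050 ci lyrics by 1,564 ci poets of the Northern and Southern Song.
Batch download tool for WeChat Official Account articles. Supports downloading comments and collections, saving as html/mhtml/md/pdf/docx files, and saving the images, videos, and audio files inside articles.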
Visualise velocity data on a leaflet layer
Provider for Chinese TMS services
A Material Design Weather Application
💬 Ready-to-use & flexible RAG Chatbot, supporting mainstream large language models (LLMs) such as DeepSeek-R1, Llama 3.3, Qwen2, OpenAI, and more.
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
🔊 Text-Prompted Generative Audio Model
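A chatbot built on large language models, with support for integration with WeChat Official Accounts, WeCom (Enterprise WeChat) apps, Feishu, DingTalk, and more. Selectable models include GPT3.5/GPT-4o/GPT-o1/DeepSeek/Claude/ERNIE Bot (文心一言)/iFlytek Spark (讯飞星火)/Tongyi Qianwen (通义千问)/Gemini/GLM-4/Kimi/LinkAI. It can handle text, voice, and images, access the operating system and the internet, and supports building a customized enterprise customer-service assistant on top of your own knowledge base.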