Starred repositories
Command-line program to download videos from YouTube.com and other video sites
Platform to experiment with the AI Software Engineer. Terminal based. NOTE: Very different from https://gptengineer.app
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and…
AiLearning: data analysis + machine learning in practice + linear algebra + PyTorch + NLTK + TF2
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Instant voice cloning by MIT and MyShell. Audio foundation model.
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
A high-quality, one-stop open-source data extraction tool for converting PDF to Markdown and JSON.
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚
Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
OCR, layout analysis, reading order, table recognition in 90+ languages
Awesome pre-trained models toolkit based on PaddlePaddle. (400+ models including Image, Text, Audio, Video and Cross-Modal with Easy Inference & Serving) [Security hardening in progress; interaction is paused, please be patient]
An orchestration platform for the development, production, and observation of data assets.
The official GitHub page for the survey paper "A Survey of Large Language Models".
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Python client for Baidu Yun / Baidu Netdisk (Personal Cloud Storage)
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.
A Comprehensive Toolkit for High-Quality PDF Content Extraction
A treasure chest for visual classification and recognition powered by PaddlePaddle
High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese, and Korean.
Silero VAD: pre-trained enterprise-grade Voice Activity Detector