Stars
Comprehensive Gradio WebUI for audio processing, powered by Whisper engines (Whisper, Faster-Whisper, Whisper-Timestamped). Features Voice Changer(RVC), zero-shot Voice Cloning (E2, F5-TTS), YouTub…
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
A framework for 4D reconstruction from monocular videos.
open Multi-View Stereo reconstruction library
📄 A curated list of awesome .cursorrules files
Convert ebooks to audiobooks with chapters and metadata using dynamic AI models and voice cloning. Supports 1,107+ languages!
Storybook is the industry standard workshop for building, documenting, and testing UI components in isolation
Introducing "Homben AI," Sri Lanka's leading AI platform that uses advanced algorithms to identify cow faces and locate stolen cows.🐮
[ECCV 2024 Oral] PetFace: A Large-Scale Dataset and Benchmark for Animal Identification https://arxiv.org/abs/2407.13555
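LLM deployment project based on MNN. This project has been merged into MNN.
An out-of-the-box Java AI platform for image, video, and speech recognition and OCR. The AI collection includes, but is not limited to, license plate recognition, safety helmet recognition, door open/close detection, and common object recognition, for both images and video. It integrates AI image recognition with OpenCV, YOLO, OCR, and the EasyAI kernel, plus an AI customer-service assistant and AI language models. It uses no third-party API, supports customizable, self-hosted offline deployment and industry-specific use, and keeps training separate from recognition to limit memory and GPU consumption.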
High-Resolution 3D Assets Generation with Large Scale Hunyuan3D Diffusion Models.
AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.
🚀🚀 [Large models] Train a 26M-parameter GPT from scratch in just 50 min!
A way to rectify curved text images using a spatial transformer guided by pairs of points.
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
KAG is a logical form-guided reasoning and retrieval framework based on OpenSPG engine and LLMs. It is used to build logical reasoning and factual Q&A solutions for professional domain knowledge ba…
PDF scientific paper translation with preserved formats: AI-based full-text bilingual translation of PDF documents that fully preserves layout, supports services such as Google/DeepL/Ollama/OpenAI, and provides CLI/GUI/Docker/Zotero.
DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding
Arbitrary-steps Image Super-resolution via Diffusion Inversion
An Android library for YOLOv5/YOLOv7/YOLOv8 detection and pose inference based on NCNN
[ECCV 2024] Official implementation of the paper "X-Pose: Detecting Any Keypoints"
An extremely fast Python package and project manager, written in Rust.
YOLOV8Pose's NCNN Android Optimization Deployment