Stars
OCR, layout analysis, reading order, table recognition in 90+ languages
D-FINE: Redefine Regression Task of DETRs as Fine-grained Distribution Refinement [ICLR 2025 Spotlight]
The reinforcement learning training code for AgiBot X1.
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
"LightRAG: Simple and Fast Retrieval-Augmented Generation"
Implementation of AlphaFold 3 from Google DeepMind in PyTorch
Labeling tool with SAM (Segment Anything Model); supports SAM, SAM2, sam-hq, MobileSAM, EdgeSAM, etc. An interactive, semi-automatic image annotation tool.
YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]
[CVPR 2024] Code for SC-GS: Sparse-Controlled Gaussian Splatting for Editable Dynamic Scenes
open Multiple View Geometry library. Basis for 3D computer vision and Structure from Motion.
Algorithm to texture 3D reconstructions from multi-view stereo images
Atlas: End-to-End 3D Scene Reconstruction from Posed Images
Code for "NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video", CVPR 2021 oral
Official implementation of "Splatter Image: Ultra-Fast Single-View 3D Reconstruction" (CVPR 2024)
This repository introduces an approach to improve the efficiency of unsupervised MVS networks. We achieve this by eliminating the need for a separate cost volume regularization step for neural rend…
Official PyTorch implementation of SPECTRE: Visual Speech-Aware Perceptual 3D Facial Expression Reconstruction from Videos
Speech To Speech: an effort for an open-sourced and modular GPT4-o
electech6 / openMVS_comments
Forked from cdcseacave/openMVS: open Multi-View Stereo reconstruction library
C++ Library Manager for Windows, Linux, and MacOS
open Multi-View Stereo reconstruction library
A self-supervised learning framework for audio-visual speech
🔥 2D and 3D face alignment library built using PyTorch
Run PyTorch LLMs locally on servers, desktop and mobile
A generative speech model for daily dialogue.
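An intelligent customer-service chat tool based on large language models. It supports integration with WeChat, Pinduoduo, Qianniu, Bilibili, Douyin Enterprise accounts, Douyin, Doudian, Weibo chat, Xiaohongshu professional account operations, Xiaohongshu, Zhihu, and other platforms, with a choice of GPT3.5/GPT4.0/懒人百宝箱 (more platforms will be supported later). It can handle text, voice, and images, access external resources such as the operating system and the internet through plugins, and supports building custom enterprise AI applications on your own knowledge base.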