Stars
[CVPR 2025] Code for Segment Any Motion in Videos
SynCity: Training-Free Generation of 3D Worlds
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
[CVPR 2025 Oral] VGGT: Visual Geometry Grounded Transformer
[CVPR 2025] Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass
[CVPR 2025] Sparse Voxels Rasterization: Real-time High-fidelity Radiance Field Rendering
PyTorch implementation of FractalGen https://arxiv.org/abs/2502.17437
Pippo: High-Resolution Multi-View Humans from a Single Image
Fully open reproduction of DeepSeek-R1
MEt3R: Measuring Multi-View Consistency in Generated Images
Agent Laboratory is an end-to-end autonomous research workflow meant to help you, the human researcher, implement your research ideas.
FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
Efficient vision foundation models for high-resolution generation and perception.
DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 Spotlight.
[NeurIPS 2024] MVInpainter: Learning Multi-View Consistent Inpainting to Bridge 2D and 3D Editing
Estimating Body and Hand Motion in an Ego-sensed World
[ECCV 2024 - ORAL] Official PyTorch implementation of Gaussian Frosting: Editable Complex Radiance Fields with Real-Time Rendering
Optimized implementation for color-icon-matrix barcodes
Python package for retrieving current and historical photos from Google Street View
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
InstantSplat: Sparse-view SfM-free Gaussian Splatting in Seconds
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838
[NeurIPS 2024] MeshXL: Neural Coordinate Field for Generative 3D Foundation Models, a 3D foundation model for mesh generation
[CVPR 2024 Highlight] XCube: Large-Scale 3D Generative Modeling using Sparse Voxel Hierarchies