- TU Munich
- Munich, Germany
- www.ymxlzgy.com
Stars
[CVPR'25 Oral] MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision
Official code for "LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias" (ICLR 2025 Oral)
[TMLR 2025🔥] A survey for the autoregressive models in vision.
[NeurIPS 2024] Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image
[CVPR 2025] MIDI: Multi-Instance Diffusion for Single Image to 3D Scene Generation
Official implementation of Continuous 3D Perception Model with Persistent State
[CVPR 2025] Source codes for the paper "3D-Mem: 3D Scene Memory for Embodied Exploration and Reasoning"
[ICLR 2025] Semi-Supervised Vision-Centric 3D Occupancy World Model for Autonomous Driving
New repo collection for NVIDIA Cosmos: https://github.com/nvidia-cosmos
[AAAI 2025] MMGDreamer: Mixed-Modality Graph for Geometry-Controllable 3D Indoor Scene Generation
A Benchmark for Low-Level Manipulation in Home Rearrangement Tasks
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
This is the official implementation of the EMNLP Findings paper, VideoINSTA: Zero-shot Long-Form Video Understanding via Informative Spatial-Temporal Reasoning
[CoRL 2024] RoboEXP: Action-Conditioned Scene Graph via Interactive Exploration for Robotic Manipulation
Code for training embodied agents using imitation learning at scale in Habitat-Lab
SAPIEN Manipulation Skill Framework, an open source GPU parallelized robotics simulator and benchmark, led by Hillbot, Inc.
QuadWBG: Generalizable Quadrupedal Whole-Body Grasping
A Framework for Open-Vocabulary Object Retrieval and Drawer Manipulation in Point Clouds
Code for paper "Grounding Video Models to Actions through Goal Conditioned Exploration".
[ECCV 2024] EchoScene: Indoor Scene Generation via Information Echo over Scene Graph Diffusion.
Clarity: A Minimalist Website Template for AI Research
Pandora: Towards General World Model with Natural Language Actions and Video States
A beautiful, simple, clean, and responsive Jekyll theme for academics
Code for MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World
[NeurIPS 2024] MeshXL: Neural Coordinate Field for Generative 3D Foundation Models, a 3D fundamental model for mesh generation