bertjiazheng/awesome-scene-understanding

Awesome Scene Understanding

A curated list of awesome scene understanding papers, inspired by awesome-computer-vision.

  • 📷 Multi-view images
  • 🎲 Point cloud

Related Resources

Workshops and Tutorials

Survey

Papers Venue Links
Neural Fields in Robotics: A Survey arXiv 2024 -
Advances in Data-Driven Analysis and Synthesis of 3D Indoor Scenes CGF 2023 -
State-of-the-art in Automatic 3D Reconstruction of Structured Indoor Environments CGF 2020 [project]
Indoor Scene Understanding in 2.5/3D for Autonomous Agents: A Survey IEEE Access 2019 -
RGBD Datasets: Past, Present and Future CVPR Workshop 2016 [project]

Dataset

Realistic Dataset

Papers Venue Links
ScanNet++: A High-Fidelity Dataset of 3D Indoor Scenes ICCV 2023 [project]
ARKitScenes: A Diverse Real-World Dataset For 3D Indoor Scene Understanding Using Mobile RGB-D Data NeurIPS 2021 Dataset Track [code]
Zillow Indoor Dataset: Annotated Floor Plans With 360˚ Panoramas and 3D Room Layouts CVPR 2021 [code]
HoliCity: A City-Scale Data Platform for Learning Holistic 3D Structures CoRR 2020 [project]
OASIS: A Large-Scale Dataset for Single Image 3D in the Wild CVPR 2020 [project]
3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera ICCV 2019 [project]
The Replica Dataset: A Digital Replica of Indoor Spaces CoRR 2019 [code]
Matterport3D: Learning from RGB-D Data in Indoor Environments 3DV 2017 [project]
Joint 2D-3D-Semantic Data for Indoor Scene Understanding CoRR 2017 [project]
ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes CVPR 2017 [project]
SceneNN: a Scene Meshes Dataset with aNNotations 3DV 2016 [project]
SUN RGB-D: A RGB-D Scene Understanding Benchmark Suite CVPR 2015 [project]
SUN3D: A Database of Big Spaces Reconstructed using SfM and Object Labels ICCV 2013 [project]
Indoor Segmentation and Support Inference from RGBD Images ECCV 2012 [project]

Synthetic Dataset

Papers Venue Links
Infinigen Indoors: Photorealistic Indoor Scenes using Procedural Generation CVPR 2024 [project]
R3DS: Reality-linked 3D Scenes for Panoramic Scene Understanding CoRR 2024 [project]
FurniScene: A Large-scale 3D Room Dataset with Intricate Furnishing Scenes CoRR 2024 -
GeoSynth: A Photorealistic Synthetic Indoor Dataset for Scene Understanding VR 2023 [code]
MINERVAS: Massive INterior EnviRonments VirtuAl Synthesis CGF 2022 [project]
3D-FRONT: 3D Furnished Rooms with layOuts and semaNTics ICCV 2021 [project]
Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding ICCV 2021 [project]
OpenRooms: An End-to-End Open Framework for Photorealistic Indoor Scene Datasets CVPR 2021 [project]
Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling ECCV 2020 [project]
InteriorNet: Mega-scale Multi-sensor Photo-realistic Indoor Scenes Dataset BMVC 2018 [project]
SceneNet RGB-D: Can 5M Synthetic Images Beat Generic ImageNet Pre-training on Indoor Segmentation? ICCV 2017 [project]
Semantic Scene Completion from a Single Depth Image CVPR 2017 -
SceneNet: Understanding Real World Indoor Scenes With Synthetic Data CVPR 2016 [project]
The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes CVPR 2016 [project]

Holistic Scene Understanding

Perspective Image

Papers Venue Links
Single-view 3D Scene Reconstruction with High-fidelity Shape and Texture 3DV 2024 [project]
Towards High-Fidelity Single-view Holistic Reconstruction of Indoor Scenes ECCV 2022 [code]
Holistic 3D Scene Understanding from a Single Image with Implicit Representation CVPR 2021 [project] [code]
Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image CVPR 2020 [code]
PerspectiveNet: 3D Object Detection from a Single RGB Image via Perspective Points NeurIPS 2019 -
Holistic++ Scene Understanding: Single-view 3D Holistic Scene Parsing and Human Pose Estimation with Human-Object Interaction and Physical Commonsense ICCV 2019 [project] [code]
Complete 3D Scene Parsing from an RGBD Image IJCV 2018 -
Cooperative Holistic Scene Understanding: Unifying 3D Object, Layout, and Camera Pose Estimation NeurIPS 2018 [project] [code]
Holistic 3D Scene Parsing and Reconstruction from a Single RGB Image ECCV 2018 [project] [code]
Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene CVPR 2018 [project] [code]
IM2CAD CVPR 2017 [project]
DeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding ICCV 2017 [project]
Emptying, Refurnishing, and Relighting Indoor Spaces SIGGRAPH Asia 2016 [project]
Scene Parsing by Integrating Function, Geometry and Appearance Models CVPR 2013 -
Understanding Indoor Scenes using 3D Geometric Phrases CVPR 2013 -
Recovering Free Space of Indoor Scenes from a Single Image CVPR 2012 -
Efficient Exact Inference for 3D Indoor Scene Understanding ECCV 2012 -
Efficient Structured Prediction for 3D Indoor Scene Understanding CVPR 2012 -
Estimating Spatial Layout of Rooms using Volumetric Reasoning about Objects and Surfaces NeurIPS 2010 -
Thinking Inside the Box: Using Appearance Models and Context Based on Room Geometry ECCV 2010 -

Panoramic Image

Papers Venue Links
PanoContext-Former: Panoramic Total Scene Understanding with a Transformer CVPR 2024 -
PanelNet: Understanding 360 Indoor Environment via Panel Representation CVPR 2023 -
DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization ICCV 2021 [code]
HoHoNet: 360 Indoor Holistic Understanding with Latent Horizontal Features CVPR 2021 [Code]
Automatic 3D Indoor Scene Modeling from Single Panorama CVPR 2018 -
Pano2CAD: Room Layout From A Single Panorama Image WACV 2017 -
PanoContext: A Whole-room 3D Context Model for Panoramic Scene Understanding ECCV 2014 [project]

Room Layout Estimation

Perspective Image

(AW: Atlanta world, SS: single-floor and single-ceiling, PP: piece-wise planarity.)

Dataset Year Modality #Frames Prior Source
CAD-Estate 2023 RGB Video - Generic RealEstate-10K
Matterport3D-Layout 2020 RGB-D 7360 PP Matterport
ScanNet-Layout 2020 RGB-D 293 PP ScanNet
Structured3D 2020 RGB-D 82027 AW+SS Structured3D
LSUN Room Layout 2016 RGB 5394 Cuboid SUN
SUN RGB-D 2015 RGB-D 10335 AW+SS NYUv2, Berkeley B3DO, and SUN3D
NYUv2 303 2013 RGB-D 303 Cuboid NYUv2
Hedau 2009 RGB 366 Cuboid -

Papers Venue Links
📷 Unposed Sparse Views Room Layout Reconstruction in the Age of Pretrain Model ICLR 2025 -
Polygon Detection for Room Layout Estimation using Heterogeneous Graphs and Wireframes ICCV Workshop 2023 [code]
ST-RoomNet: Learning Room Layout Estimation From Single Image Through Unsupervised Spatial Transformations CVPR Workshop 2023 -
Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image WACV 2022 [code]
RoomStructNet: Learning to Rank Non-Cuboidal Room Layouts From Single View CoRR 2021 -
GeoLayout: Geometry Driven Room Layout Estimation Based on Depth Maps of Planes ECCV 2020 [Matterport3D Layout Dataset]
Structural Deep Metric Learning for Room Layout Estimation ECCV 2020 -
General 3D Room Layout from a Single View by Render-and-Compare ECCV 2020 [project] [ScanNet-Layout Dataset] [code]
Smart Hypothesis Generation for Efficient and Robust Room Layout Estimation WACV 2020 -
Flat2Layout: Flat Representation for Estimating Layout of General Room Types CoRR 2019 -
Thinking Outside the Box: Generation of Unconstrained 3D Room Layouts ACCV 2018 -
RoomNet: End-to-End Room Layout Estimation ICCV 2017 -
Physics Inspired Optimization on Semantic Transfer Features: An Alternative Method for Room Layout Estimation CVPR 2017 [project]
A Coarse-to-Fine Indoor Layout Estimation (CFILE) Method ACCV 2016 -
DeLay: Robust Spatial Layout Estimation for Cluttered Indoor Scenes CVPR 2016 -
Learning Informative Edge Maps for Indoor Scene Layout Prediction ICCV 2015 -
Rent3D: Floor-Plan Priors for Monocular Layout Estimation CVPR 2015 [project]
Box In the Box: Joint 3D Layout and Object Reasoning from Single Images CVPR 2013 -
Estimating the 3D Layout of Indoor Scenes and its Clutter from Depth Sensors ICCV 2013 [project]
Recovering the Spatial Layout of Cluttered Rooms ICCV 2009 -

Panoramic Image

(MW: Manhattan world, AW: Atlanta world, SS: single-floor and single-ceiling.)

Dataset Year Modality #Frames Prior Source
ZInD 2021 RGB 71474 AW+SS ZInD
MatterportLayout 2020 RGB-D 2295 MW+SS Matterport
Structured3D 2020 RGB-D 196515 AW+SS Structured3D
LayoutMP3D 2020 RGB-D 2505 MW+SS Matterport
2D-3D-S 2018 RGB-D 571 Cuboid 2D-3D-S
PanoContext 2014 RGB 500 Cuboid SUN360

Papers Venue Links
uLayout: Unified Room Layout Estimation for Perspective and Panoramic Images WACV 2025 -
No More Ambiguity in 360° Room Layout via Bi-Layout Estimation CVPR 2024
Seg2Reg: Differentiable 2D Segmentation to 1D Regression Rendering for 360 Room Layout Reconstruction CVPR 2024
iBARLE: imBalance-Aware Room Layout Estimation CoRR 2023
📷 GPR-Net: Multi-view Layout Estimation via a Geometry-aware Panorama Registration Network CVPR Workshop 2023 -
Shape-Net: Room Layout Estimation from Panoramic Images Robust to Occlusion using Knowledge Distillation with 3D Shapes as Additional Inputs CVPR Workshop 2023
U2RLE: Uncertainty-Guided 2-Stage Room Layout Estimation CVPR 2023
Disentangling Orthogonal Planes for Indoor Panoramic Room Layout Estimation with Cross-Scale Distortion Awareness CVPR 2023 [Code]
📷 360-MLC: Multi-view Layout Consistency for Self-training and Hyper-parameter Tuning NeurIPS 2022 [Project]
3D Room Layout Estimation from a Cubemap of Panorama Image via Deep Manhattan Hough Transform ECCV 2022 [Code]
3D Room Layout Recovery Generalizing across Manhattan and Non-Manhattan Worlds CVPR 2022 -
📷 PSMNet: Position-aware Stereo Merging Network for Room Layout Estimation CVPR 2022 [code]
Self-supervised 360˚ Room Layout Estimation CoRR 2022 [code]
LGT-Net: Indoor Panoramic Room Layout Estimation with Geometry-Aware Transformer Network CVPR 2022 -
Deep3DLayout: 3D Reconstruction of an Indoor Layout from a Spherical Panoramic Image SIGGRAPH Asia 2021 [project]
Transferable End-to-end Room Layout Estimation via Implicit Encoding CoRR 2021 [project]
OmniLayout: Room Layout Reconstruction from Indoor Spherical Panoramas CVPR Workshop 2021 [code]
LED2-Net: Monocular 360˚ Layout Estimation via Differentiable Depth Rendering CVPR 2021 [project] [code]
SSLayout360: Semi-Supervised Indoor Layout Estimation from 360 Panorama CVPR 2021 -
Single-Shot Cuboids: Geodesics-based End-to-end Manhattan Aligned Layout Estimation from Spherical Panoramas Image and Vision Computing 2021 [project] [code]
Manhattan Room Layout Reconstruction from a Single 360 image: A Comparative Study of State-of-the-art Methods IJCV 2021 [code] [MatterportLayout Dataset]
Training and Post Processing 3D Room Layout Beyond the Manhattan World Assumption ECCV Workshop 2020 -
Joint 3D Layout and Depth Prediction from a Single Indoor Panorama Image ECCV 2020 -
AtlantaNet: Inferring the 3D Indoor Layout from a Single 360 Image Beyond the Manhattan World Assumption ECCV 2020 [project] [code]
Corners for Layout: End-to-End Layout Recovery from 360 Images ICRA 2019 [project] [code]
DuLa-Net: A Dual-Projection Network for Estimating Room Layouts from a Single RGB Panorama CVPR 2019 [project]
HorizonNet: Learning Room Layout with 1D Representation and Pano Stretch Data Augmentation CVPR 2019 [code]
Layouts from Panoramic Images with Geometry and Deep Learning IROS 2018 [code]
LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image CVPR 2018 [code]
Efficient 3D Room Shape Recovery From a Single Panorama CVPR 2016 [code]

Floorplan

Papers Venue Links
🎲 FRI-Net: Floorplan Reconstruction via Room-wise Implicit Representation ECCV 2024 [code]
🎲 PolyRoom: Room-aware Transformer for Floorplan Reconstruction ECCV 2024 [code]
🎲 PolyDiffuse: Polygonal Shape Reconstruction via Guided Set Diffusion Models NeurIPS 2023 [project]
🎲 Connecting the Dots: Floorplan Reconstruction Using Two-Level Queries CVPR 2023 [project] [code]
πŸ“· Floorplan Restoration by Structure Hallucinating Transformer Cascades CoRR 2022 -
πŸ“· MVLayoutNet: 3D Layout Reconstruction with Multi-View Panoramas CoRR 2021 -
πŸ“· Extreme Structure From Motion for Indoor Panoramas Without Visual Overlaps ICCV 2021 [code]
🎲 MonteFloor: Extending MCTS for Reconstructing Accurate Large-Scale Floor Plans ICCV 2021 -
🎲 Scan2Plan: Efficient Floorplan Generation from 3D Scans of Indoor Scenes CoRR 2020 -
🎲 Floor-SP: Inverse CAD for Floorplans by Sequential Room-wise Shortest Path ICCV 2019 [project] [code]
πŸ“· Floorplan-Jigsaw: Jointly Estimating Scene Layout and Aligning Partial Scans ICCV 2019 [project]
🎲 DeepPerimeter: Indoor Boundary Estimation from Posed Monocular Sequences CoRR 2019 -
πŸ“· FloorNet: A unified framework for floorplan reconstruction from 3D scans ECCV 2018 [project] [code]

Floorplan Vectorization

Papers Venue Links
VectorFloorSeg: Two-Stream Graph Attention Network for Vectorized Roughcast Floorplan Segmentation CVPR 2023 [code]
Parsing Line Segments of Floor Plan Images Using Graph Neural Networks CoRR 2023 -
Residential floor plan recognition and reconstruction CVPR 2021 -
Versailles-FP dataset: Wall Detection in Ancient Floor Plans CoRR 2021 -
Deep Floor Plan Recognition using a Multi-task Network with Room-boundary-Guided Attention ICCV 2019 [project]
CubiCasa5K: A Dataset and an Improved Multi-Task Model for Floorplan Image Analysis Scandinavian Conference on Image Analysis 2019 [code]
Raster-to-Vector: Revisiting Floorplan Transformation ICCV 2017 [project] [code]

Visual Localization

Papers Venue Links
SPVLoc: Semantic Panoramic Viewport Matching for 6D Camera Localization in Unseen Environments ECCV 2024 [project] [code]
LaLaLoc++: Global Floor Plan Comprehension for Layout Localisation in Unvisited Environments ECCV 2022 [code]
LASER: LAtent SpacE Rendering for 2D Visual Localization CVPR 2022 -
LaLaLoc: Latent Layout Localisation in Dynamic, Unvisited Environments ICCV 2021 -

Primitive

Junction

Papers Venue Links
Manhattan Junction Catalogue for Spatial Reasoning of Indoor Scenes CVPR 2013 -

Line Segment and Wireframe

Papers Venue Links
📷 Volumetric Wireframe Parsing from Neural Attraction Fields CoRR 2023 [code]
📷 NEF: Neural Edge Fields for 3D Parametric Curve Reconstruction from Multi-view Images CVPR 2023 [project]
DeepLSD: Line Segment Detection and Refinement with Deep Image Gradients CoRR 2022 [Code]
Holistically-Attracted Wireframe Parsing: From Supervised to Self-Supervised Learning CoRR 2022 -
🎲Learning to Construct 3D Building Wireframes from 3D Line Clouds BMVC 2022 [Code]
HoW-3D: Holistic 3D Wireframe Perception from a Single Image 3DV 2022 [Code]
Semantic Room Wireframe Detection from a Single View ICPR 2022 [code]
Towards Real-time and Light-weight Line Segment Detection AAAI 2022 [code]
Hole-robust Wireframe Detection WACV 2022 -
Fully Convolutional Line Parsing Neurocomputing 2022 [code]
ELSD: Efficient Line Segment Detector and Descriptor ICCV 2021 -
SOLD2: Self-supervised Occlusion-aware Line Description and Detection CVPR 2021 [code]
Line Segment Detection Using Transformers without Edges CVPR 2021 [code]
PlueckerNet: Learn to Register 3D Line Reconstructions CVPR 2020 [code]
LGNN: A Context-aware Line Segment Detector ACM MM 2020 -
TP-LSD: Tri-Points Based Line Segment Detector ECCV 2020 [code]
Deep Hough-Transform Line Priors ECCV 2020 [code]
Deep Hough Transform for Semantic Line Detection ECCV 2020 [code]
Holistically-Attracted Wireframe Parsing CVPR 2020 [code]
Learning to Reconstruct 3D Manhattan Wireframes from a Single Image ICCV 2019 [code]
End-to-End Wireframe Parsing ICCV 2019 [code]
PPGNet: Learning Point-Pair Graph for Line Segment Detection CVPR 2019 [code]
Learning Attraction Field Representation for Robust Line Segment Detection CVPR 2019 [code]
Novel Single View Constraints for Manhattan 3D Line Reconstruction 3DV 2018 -
Learning to Parse Wireframes in Images of Man-Made Environments CVPR 2018 [code]
A Novel Linelet-Based Representation for Line Segment Detection TPAMI 2018 -
MCMLSD: A Dynamic Programming Approach to Line Segment Detection CVPR 2017 -
Lifting 3D Manhattan Lines from a Single Image ICCV 2013 -
LSD: A Fast Line Segment Detector with a False Detection Control TPAMI 2010 -

Outdoor Architecture

Papers Venue Links
HEAT: Holistic Edge Attention Transformer for Structured Reconstruction CVPR 2022 [Project]
Structured Outdoor Architecture Reconstruction by Exploration and Classification ICCV 2021 [Project]
Roof-GAN: Learning to Generate Roof Geometry and Relations for Residential Houses CVPR 2021 [Code]
Vectorizing World Buildings: Planar Graph Reconstruction by Primitive Detection and Relationship Inference ECCV 2020 [Project]
Conv-MPN: Convolutional Message Passing Neural Network for Structured Outdoor Architecture Reconstruction CVPR 2020 [Project]

Plane

Papers Venue Links
MonoPlane: Exploiting Monocular Geometric Cues for Generalizable 3D Plane Reconstruction IROS 2024 [code]
📷 UniPlane: Unified Plane Detection and Reconstruction from Posed Monocular Videos CoRR 2024
📷 AirPlanes: Accurate Plane Estimation via 3D-Consistent Embeddings CVPR 2024 [project]
PlaneRecTR: Unified Query learning for 3D Plane Recovery from a Single View ICCV 2023 [Code]
📷 NOPE-SAC: Neural One-Plane RANSAC for Sparse-View Planar 3D Reconstruction CoRR 2022 [Code]
📷 PlaneFormers: From Sparse View Planes to 3D Reconstruction ECCV 2022 [project] [code]
📷 PlanarRecon: Real-time 3D Plane Detection and Reconstruction from Posed Monocular Videos CVPR 2022 [Project]
PlaneRecNet: Multi-Task Learning with Cross-Task Consistency for Piece-Wise Plane Detection and Reconstruction from a Single RGB Image BMVC 2021 [code]
PlaneTR: Structure-Guided Transformers for 3D Plane Recovery ICCV 2021 [code]
📷 Planar Surface Reconstruction From Sparse Views ICCV 2021 [project] [code]
Indoor Panorama Planar 3D Reconstruction via Divide and Conquer CVPR 2021 [code]
Learning Pairwise Inter-Plane Relations for Piecewise Planar Reconstruction ECCV 2020 [code]
Peek-a-Boo: Occlusion Reasoning in Indoor Scenes with Plane Representations CVPR 2020 [project]
Single-Image Piece-wise Planar 3D Reconstruction via Associative Embedding CVPR 2019 [code]
PlaneRCNN: 3D Plane Detection and Reconstruction from a Single Image CVPR 2019 [project] [code]
Recovering 3D Planes from a Single Image via Convolutional Neural Networks ECCV 2018 [code]
PlaneNet: Piece-wise Planar Reconstruction from a Single RGB Image CVPR 2018 [project] [code]

Vanishing Point

Papers Venue Links
Vanishing Point Estimation in Uncalibrated Images with Prior Gravity Direction ICCV 2023 [code]
Transformer Based Line Segment Classifier with Image Context for Real-Time Vanishing Point Detection in Manhattan World CVPR 2022 -
Deep Vanishing Point Detection: Geometric Priors Make Dataset Variations Vanish CVPR 2022 -
VaPiD: A Rapid Vanishing Point Detector via Learned Optimizers ICCV 2021 -
NeurVPS: Neural Vanishing Point Scanning via Conic Convolution NeurIPS 2019 [Code]