Our code is based on the following environment.
1. Create the conda environment
conda create -n embodiedocc python=3.8.19
conda activate embodiedocc
2. Install PyTorch
pip install torch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 --index-url https://download.pytorch.org/whl/cu113
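Optionally, you can sanity-check the PyTorch install before continuing (a quick check, not part of the original instructions):
# Optional sanity check: confirm the CUDA build of PyTorch is active.
import torch
print(torch.__version__)          # expected: 1.12.1 (cu113 build)
print(torch.version.cuda)         # expected: 11.3
print(torch.cuda.is_available())  # expected: True on a machine with a working GPU driver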
3. Install some packages following GaussianFormer
pip install openmim==0.3.9
mim install mmcv==2.0.1
mim install mmdet==3.0.0
mim install mmsegmentation==1.2.2
mim install mmdet3d==1.1.1
pip install spconv-cu114==2.3.6
pip install timm
pip install vtk==9.0.1
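Optionally, verify that the OpenMMLab packages resolve to the pinned versions (a quick check, not required by the setup):
# Optional sanity check: the printed versions should match the ones installed above.
import mmcv, mmdet, mmseg, mmdet3d
print(mmcv.__version__)     # expected: 2.0.1
print(mmdet.__version__)    # expected: 3.0.0
print(mmseg.__version__)    # expected: 1.2.2
print(mmdet3d.__version__)  # expected: 1.1.1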
4. Install EmbodiedOcc
git clone --recursive https://github.com/YkiWu/EmbodiedOcc.git
cd EmbodiedOcc
cd model/encoder/gaussianformer/ops && pip install -e . && cd ../../../..   # return to the EmbodiedOcc root
cd model/head/gaussian_occ_head/ops/localagg && pip install -e . && cd ../../../../..   # return to the EmbodiedOcc root
pip install -r requirements.txt
5. Download Depth-Anything-V2 and put it under EmbodiedOcc
cd EmbodiedOcc
git clone https://github.com/DepthAnything/Depth-Anything-V2.git
Folder structure
EmbodiedOcc
├── ...
├── Depth-Anything-V2
Go to Depth-Anything-V2/metric_depth/depth_anything_v2/dpt.py and change the function infer_image in the class DepthAnythingV2 as follows:
    def infer_image(self, image, h_, w_, input_size=518):
        depth = self.forward(image)
        depth = F.interpolate(depth[:, None], (h_, w_), mode="bilinear", align_corners=True)[0, 0]
        return depth
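After this change, infer_image expects an already preprocessed image tensor plus the target output height and width, rather than a raw image. The sketch below is a minimal illustrative call; the constructor arguments follow the standard ViT-L metric-depth config from the Depth-Anything-V2 repo and are assumptions for illustration, not something EmbodiedOcc requires:
# Illustrative only: instantiate the metric-depth model and call the edited infer_image.
# Run from the EmbodiedOcc root so the cloned repo is on the path.
import sys
sys.path.insert(0, 'Depth-Anything-V2/metric_depth')  # makes depth_anything_v2 importable
import torch
from depth_anything_v2.dpt import DepthAnythingV2

# assumed ViT-L configuration; adjust encoder/features/out_channels/max_depth to your setup
model = DepthAnythingV2(encoder='vitl', features=256, out_channels=[256, 512, 1024, 1024], max_depth=20)
model.eval()

image = torch.randn(1, 3, 518, 518)  # preprocessed RGB tensor; sides must be multiples of 14
with torch.no_grad():
    depth = model.infer_image(image, 480, 640)  # depth map resized to the requested (h_, w_)
print(depth.shape)  # torch.Size([480, 640])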
6. Download EfficientNet-PyTorch and put it under EmbodiedOcc
cd EmbodiedOcc
git clone https://github.com/lukemelas/EfficientNet-PyTorch.git
Folder structure
EmbodiedOcc
├── ...
├── Depth-Anything-V2
├── EfficientNet-PyTorch
7. Download our finetuned checkpoint of Depth-Anything-V2 on Occ-ScanNet and put it under the checkpoints folder
Folder structure
EmbodiedOcc
├── ...
├── checkpoints/
│   ├── finetune_scannet_depthanythingv2.pth
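Optionally, confirm that the downloaded checkpoint deserializes; this is a hedged smoke test that makes no assumption about the internal key layout:
# Optional smoke test: just confirm the .pth file loads on CPU.
import torch
ckpt = torch.load('checkpoints/finetune_scannet_depthanythingv2.pth', map_location='cpu')
# some checkpoints wrap the weights in a dict (e.g. under a 'model' key), others are a bare state dict
state = ckpt.get('model', ckpt) if isinstance(ckpt, dict) else ckpt
print(type(ckpt).__name__, len(state) if hasattr(state, '__len__') else 'n/a')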