Our code is based on the following environment.
1. Create the conda environment
conda create -n embodiedocc python=3.8.19
conda activate embodiedocc
2. Install PyTorch
pip install torch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 --index-url https://download.pytorch.org/whl/cu113
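Optionally, you can sanity-check the PyTorch install before continuing (a quick check, not part of the original instructions):
# Optional sanity check: confirm the CUDA build of PyTorch is active.
import torch
print(torch.__version__)          # expected: 1.12.1 (cu113 build)
print(torch.version.cuda)         # expected: 11.3
print(torch.cuda.is_available())  # expected: True on a machine with a working GPU driver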
3. Install some packages following GaussianFormer
pip install openmim==0.3.9
mim install mmcv==2.0.1
mim install mmdet==3.0.0
mim install mmsegmentation==1.2.2
mim install mmdet3d==1.1.1
pip install spconv-cu114==2.3.6
pip install timm
pip install vtk==9.0.1
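Optionally, verify that the OpenMMLab packages resolve to the pinned versions (a quick check, not required by the setup):
# Optional sanity check: the printed versions should match the ones installed above.
import mmcv, mmdet, mmseg, mmdet3d
print(mmcv.__version__)     # expected: 2.0.1
print(mmdet.__version__)    # expected: 3.0.0
print(mmseg.__version__)    # expected: 1.2.2
print(mmdet3d.__version__)  # expected: 1.1.1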
4. Install EmbodiedOcc
git clone --recursive https://github.com/YkiWu/EmbodiedOcc.git
cd EmbodiedOcc
cd model/encoder/gaussianformer/ops && pip install -e . && cd ../../../..   # return to the EmbodiedOcc root
cd model/head/gaussian_occ_head/ops/localagg && pip install -e . && cd ../../../../..   # return to the EmbodiedOcc root
pip install -r requirements.txt
5. Download Depth-Anything-V2 and put it under EmbodiedOcc
cd EmbodiedOcc
git clone https://github.com/DepthAnything/Depth-Anything-V2.git
Folder structure
EmbodiedOcc
├── ...
├── Depth-Anything-V2
Go to Depth-Anything-V2/metric_depth/depth_anything_v2/dpt.py and change the function infer_image in the class DepthAnythingV2 as follows:
    def infer_image(self, image, h_, w_, input_size=518):
        depth = self.forward(image)
        depth = F.interpolate(depth[:, None], (h_, w_), mode="bilinear", align_corners=True)[0, 0]
        return depth
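After this change, infer_image expects an already preprocessed image tensor plus the target output height and width, rather than a raw image. The sketch below is a minimal illustrative call; the constructor arguments follow the standard ViT-L metric-depth config from the Depth-Anything-V2 repo and are assumptions for illustration, not something EmbodiedOcc requires:
# Illustrative only: instantiate the metric-depth model and call the edited infer_image.
# Run from the EmbodiedOcc root so the cloned repo is on the path.
import sys
sys.path.insert(0, 'Depth-Anything-V2/metric_depth')  # makes depth_anything_v2 importable
import torch
from depth_anything_v2.dpt import DepthAnythingV2

# assumed ViT-L configuration; adjust encoder/features/out_channels/max_depth to your setup
model = DepthAnythingV2(encoder='vitl', features=256, out_channels=[256, 512, 1024, 1024], max_depth=20)
model.eval()

image = torch.randn(1, 3, 518, 518)  # preprocessed RGB tensor; sides must be multiples of 14
with torch.no_grad():
    depth = model.infer_image(image, 480, 640)  # depth map resized to the requested (h_, w_)
print(depth.shape)  # torch.Size([480, 640])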
6. Download EfficientNet-PyTorch and put it under EmbodiedOcc
cd EmbodiedOcc
git clone https://github.com/lukemelas/EfficientNet-PyTorch.git
Folder structure
EmbodiedOcc
├── ...
├── Depth-Anything-V2
├── EfficientNet-PyTorch
7. Download our finetuned checkpoint of Depth-Anything-V2 on Occ-ScanNet and put it under the checkpoints folder
Folder structure
EmbodiedOcc
├── ...
├── checkpoints/
│   ├── finetune_scannet_depthanythingv2.pth
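Optionally, confirm that the downloaded checkpoint deserializes; this is a hedged smoke test that makes no assumption about the internal key layout:
# Optional smoke test: just confirm the .pth file loads on CPU.
import torch
ckpt = torch.load('checkpoints/finetune_scannet_depthanythingv2.pth', map_location='cpu')
# some checkpoints wrap the weights in a dict (e.g. under a 'model' key), others are a bare state dict
state = ckpt.get('model', ckpt) if isinstance(ckpt, dict) else ckpt
print(type(ckpt).__name__, len(state) if hasattr(state, '__len__') else 'n/a')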