Evolving Symbolic 3D Visual Grounder with Weakly Supervised Reflection

Code for the paper "Evolving Symbolic 3D Visual Grounder with Weakly Supervised Reflection".

Environment Installation

pip install -r requirements.txt

Set your OpenAI API key:

export OPENAI_API_KEY=your_api_key
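
To confirm the key is picked up, a quick sanity check can help. This is a minimal sketch assuming the openai>=1.0 Python client, which reads OPENAI_API_KEY from the environment; adapt it if requirements.txt pins an older client:

# Sanity check: any successful API call confirms the key is set correctly.
from openai import OpenAI

client = OpenAI()  # raises an error if OPENAI_API_KEY is not set
print(len(client.models.list().data), "models visible")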

Data Preparation

The data/ directory should be organized as follows:

data
├── frames
│   ├── color
│   │   ├── 0.png
│   │   ├── 20.png
│   │   └── ...
├── referit3d
│   ├── annotations
│   ├── scan_data
├── symbolic_exp
│   ├── nr3d.jsonl
│   ├── scanrefer.json
├── test_data
│   ├── above
│   ├── behind
│   ├── ...
├── seg
├── nr3d_masks
├── scanrefer_masks
├── feats_3d.pkl
├── tables.pkl

  • frames: RGB images of the scenes. download_link
  • referit3d: processed referit3d dataset from vil3dref
  • symbolic_exp: symbolic expressions.
  • test_data: test data for code generation.
  • seg: segmentation results of 3D point clouds for ScanRefer. download_link
  • nr3d_masks: 2D GT object masks. download_link
  • scanrefer_masks: 2D predicted object masks. download_link
  • feats_3d.pkl: predicted object labels for Nr3D from ZSVG3D
  • tables.pkl: tables for code generation. download_link
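
Once the files are in place, a quick layout check can catch missing downloads early. A minimal Python sketch, using only the paths listed above (adjust if your layout differs):

from pathlib import Path

# Expected entries, taken verbatim from the tree above.
EXPECTED = [
    "frames/color",
    "referit3d/annotations",
    "referit3d/scan_data",
    "symbolic_exp/nr3d.jsonl",
    "symbolic_exp/scanrefer.json",
    "test_data",
    "seg",
    "nr3d_masks",
    "scanrefer_masks",
    "feats_3d.pkl",
    "tables.pkl",
]

root = Path("data")
for rel in EXPECTED:
    status = "ok" if (root / rel).exists() else "MISSING"
    print(f"[{status:>7}] {root / rel}")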

(Optional) Relation Encoder Generation

Run src/relation_encoders/run_optim.py to generate relation encoders for the following relations: left, right, between, corner, above, below, and behind.

Once the optimization is done, the relation encoders and their accuracies on the test cases are saved under data/test_data/{relation_name}/trajs. You can then select the best relation encoder per relation for evaluation, or use the provided relation encoders in src/relation_encoders.
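
To see what a run produced, a small listing sketch like the following relies only on the directory layout described above; the file naming inside trajs/ is whatever run_optim.py writes:

from pathlib import Path

# Relations optimized by run_optim.py, per the section above.
RELATIONS = ["left", "right", "between", "corner", "above", "below", "behind"]

for relation in RELATIONS:
    traj_dir = Path("data/test_data") / relation / "trajs"
    if not traj_dir.is_dir():
        print(f"[missing] {traj_dir}")
        continue
    candidates = sorted(p.name for p in traj_dir.iterdir())
    print(f"{relation}: {len(candidates)} candidate file(s): {candidates}")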

(Optional) Features Computation

python -m src.relation_encoders.compute_features \
    --dataset scanrefer \
    --output $OUTPUT_DIR \
    --label pred

The --dataset option can be scanrefer or nr3d, and --label can be gt or pred. Currently only the pred label is supported for ScanRefer, because its standard evaluation protocol provides no GT labels.
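
For example, combining these options, an Nr3D run with ground-truth labels would look like:

python -m src.relation_encoders.compute_features \
    --dataset nr3d \
    --output $OUTPUT_DIR \
    --label gt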

After running, you will get features in .pth format in the $OUTPUT_DIR directory.
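
To inspect the result, a quick probe with torch.load works; this sketch assumes a per-scene mapping, as the filenames below suggest, but the exact schema is whatever compute_features writes:

import torch

# Probe the computed features; adjust the path to your $OUTPUT_DIR.
# On newer PyTorch you may need weights_only=False for pickled Python objects.
feats = torch.load("output/scanrefer_features_per_scene.pth", map_location="cpu")
print(type(feats))
if isinstance(feats, dict):  # assumed layout: one entry per scene
    for scene_id, value in list(feats.items())[:3]:
        print(scene_id, type(value))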

You can also download our prepared features: nr3d (pred label), nr3d (gt label), scanrefer.

Evaluation

Nr3D Evaluation:

python -m src.eval.eval_nr3d \
    --features_path output/nr3d_features_per_scene_pred_label.pth \
    --top_k 5 \
    --threshold 0.9 \
    --label_type pred \
    --use_vlm 

ScanRefer Evaluation:

python -m src.eval.eval_scanrefer \
    --features_path output/scanrefer_features_per_scene.pth \
    --top_k 5 \
    --threshold 0.1 \
    --use_vlm

Change --features_path and --label_type if you'd like to evaluate with ground-truth labels. Set --use_vlm, --top_k, and --threshold to use the VLM for evaluation; please refer to our paper for the meanings of these parameters.
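
For example, a ground-truth-label run on Nr3D would look like the following, where $NR3D_GT_FEATURES_PATH stands in for the GT-label features file you computed or downloaded above (the exact filename is not fixed here):

python -m src.eval.eval_nr3d \
    --features_path $NR3D_GT_FEATURES_PATH \
    --top_k 5 \
    --threshold 0.9 \
    --label_type gt \
    --use_vlm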

Acknowledgement

Thanks to the following repositories for their contributions:

Awesome Concurrent Works
