Code for the paper "Evolving Symbolic 3D Visual Grounder with Weakly Supervised Reflection".
Install the dependencies:

```bash
pip install -r requirements.txt
```
Set your OpenAI API key:

```bash
export OPENAI_API_KEY=your_api_key
```
The `data/` directory should be organized as follows:
```
data
├── frames
│   ├── color
│   │   ├── 0.png
│   │   ├── 20.png
│   │   └── ...
├── referit3d
│   ├── annotations
│   ├── scan_data
├── symbolic_exp
│   ├── nr3d.jsonl
│   ├── scanrefer.json
├── test_data
│   ├── above
│   ├── behind
│   ├── ...
├── seg
├── nr3d_masks
├── scanrefer_masks
├── feats_3d.pkl
├── tables.pkl
```
- `frames`: RGB images of the scenes. download_link
- `referit3d`: processed referit3d dataset from vil3dref.
- `symbolic_exp`: symbolic expressions.
- `test_data`: test data for code generation.
- `seg`: segmentation results of 3D point clouds for ScanRefer. download_link
- `nr3d_masks`: 2D GT object masks. download_link
- `scanrefer_masks`: 2D predicted object masks. download_link
- `feats_3d.pkl`: predicted object labels for Nr3D, from ZSVG3D.
- `tables.pkl`: tables for code generation. download_link
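If you want to check that everything is in place before running anything, here is a minimal sanity-check sketch. The paths come from the tree above; the contents of the two pickle files are not documented here, so the script only reports their top-level Python types:

```python
import os
import pickle

DATA_ROOT = "data"

# Paths taken from the directory tree above.
expected = [
    "frames/color",
    "referit3d/annotations",
    "referit3d/scan_data",
    "symbolic_exp/nr3d.jsonl",
    "symbolic_exp/scanrefer.json",
    "test_data",
    "seg",
    "nr3d_masks",
    "scanrefer_masks",
    "feats_3d.pkl",
    "tables.pkl",
]

for rel_path in expected:
    path = os.path.join(DATA_ROOT, rel_path)
    status = "OK     " if os.path.exists(path) else "MISSING"
    print(f"{status} {path}")

# Peek at the pickle files; their internal structure is not documented here,
# so we only print the top-level type of each loaded object.
for name in ("feats_3d.pkl", "tables.pkl"):
    path = os.path.join(DATA_ROOT, name)
    if os.path.exists(path):
        with open(path, "rb") as f:
            obj = pickle.load(f)
        print(name, "->", type(obj).__name__)
```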
Run `src/relation_encoders/run_optim.py` to generate relation encoders for the following relations: `left`, `right`, `between`, `corner`, `above`, `below`, and `behind`.
After the optimization finishes, the relation encoders and their accuracies on the test cases are written to `data/test_data/{relation_name}/trajs`. You can then select the best relation encoder for each relation for evaluation, or use the relation encoders already provided in `src/relation_encoders`.
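The format of the trajectory files is not specified here, so the following is only a hedged helper for browsing what the optimization produced per relation (the relation names match the list above; comparing the reported accuracies and picking the best encoder is still a manual step):

```python
import glob
import os

RELATIONS = ["left", "right", "between", "corner", "above", "below", "behind"]

for relation in RELATIONS:
    traj_dir = os.path.join("data", "test_data", relation, "trajs")
    # List whatever the optimizer wrote for this relation; open these files
    # to compare accuracies and choose the encoder to use for evaluation.
    files = sorted(glob.glob(os.path.join(traj_dir, "*")))
    print(f"{relation}: {len(files)} file(s) in {traj_dir}")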
Next, compute the features:

```bash
python -m src.relation_encoders.compute_features \
    --dataset scanrefer \
    --output $OUTPUT_DIR \
    --label pred
```
The `--dataset` option can be `scanrefer` or `nr3d`, and the `--label` option can be `gt` or `pred`. For ScanRefer, only the `pred` label is currently supported, because the standard evaluation protocol provides no GT labels.
After it finishes, the features are saved in `.pth` format in the `$OUTPUT_DIR` directory. You can also download our prepared features: nr3d (pred label), nr3d (gt label), scanrefer.
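If you'd like to peek inside a feature file before evaluation, here is a minimal inspection sketch. The file name matches the Nr3D evaluation command below; the internal structure of the saved object is not documented here, so the snippet only prints what it finds:

```python
import torch

# File name matches the Nr3D evaluation command below; point this at your own
# output (or a downloaded feature file) as needed.
features = torch.load(
    "output/nr3d_features_per_scene_pred_label.pth", map_location="cpu"
)

print(type(features).__name__)
# The file name suggests features are stored per scene, so a dict keyed by
# scene id is a reasonable guess -- but that is an assumption, so just print
# a few entries of whatever structure is actually there.
if isinstance(features, dict):
    for key, value in list(features.items())[:3]:
        print(key, "->", type(value).__name__)
```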
Nr3D Evaluation:

```bash
python -m src.eval.eval_nr3d \
    --features_path output/nr3d_features_per_scene_pred_label.pth \
    --top_k 5 \
    --threshold 0.9 \
    --label_type pred \
    --use_vlm
```
ScanRefer Evaluation:

```bash
python -m src.eval.eval_scanrefer \
    --features_path output/scanrefer_features_per_scene.pth \
    --top_k 5 \
    --threshold 0.1 \
    --use_vlm
```
Change `--features_path` and `--label_type` if you'd like to evaluate on the ground-truth labels. Set `--use_vlm`, `--top_k`, and `--threshold` to use the VLM for evaluation. Please refer to our paper for the meanings of these parameters.
Thanks to the following repositories for their contributions: