codenewww (codenewww) / Repositories

Code for the paper "CLIP-Guided Vision-Language Pre-training for Question Answering in 3D Scenes" (CVPRW 2023)

Python Updated Mar 26, 2024

Python Other Updated Mar 20, 2024

[CVPR2022 Oral] 3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds

Python Other Updated Dec 19, 2023

[ICLR 2023] SQA3D for embodied scene understanding and reasoning

Python Apache License 2.0 Updated Dec 6, 2023

Forked from MILVLG/mcan-vqa

Deep Modular Co-Attention Networks for Visual Question Answering

Python Apache License 2.0 Updated Nov 16, 2023
