EPCL - AAAI 2024 Oral

EPCL: Frozen CLIP Transformer is An Efficient Point Cloud Encoder

The pretrain-finetune paradigm has achieved great success in NLP and 2D vision because pretrained models in those fields learn high-quality, transferable representations. Pretraining a comparably strong model is difficult for 3D point clouds because of the limited amount of point cloud sequences. This paper introduces Efficient Point Cloud Learning (EPCL), an effective and efficient point cloud learner that trains high-quality point cloud models directly with a frozen CLIP transformer. EPCL connects the 2D and 3D modalities by semantically aligning image features and point cloud features without paired 2D-3D data. Specifically, the input point cloud is divided into a series of local patches, which the designed point cloud tokenizer converts to token embeddings. These token embeddings are concatenated with a task token and fed into the frozen CLIP transformer to learn point cloud representations. The intuition is that the proposed point cloud tokenizer projects the input point cloud into a unified token space similar to that of 2D images.
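The patch-and-token pipeline described above can be sketched roughly as follows. This is a minimal NumPy illustration, not the repository's implementation. Farthest point sampling with k-nearest-neighbor grouping, the linear-plus-max-pool tokenizer, the token width of 768, the patch counts, and placing the task token first are all assumptions made for this example. Every function name here is hypothetical.

```python
import numpy as np


def farthest_point_sample(points, num_centers, start=0):
    """Pick indices of patch centers that spread out over the cloud (assumed strategy)."""
    chosen = np.empty(num_centers, dtype=np.int64)
    # Squared distance from every point to its nearest already-chosen center.
    dist = np.full(points.shape[0], np.inf)
    idx = start
    for i in range(num_centers):
        chosen[i] = idx
        d = np.sum((points - points[idx]) ** 2, axis=1)
        dist = np.minimum(dist, d)
        idx = int(np.argmax(dist))  # next center: the point farthest from all chosen ones
    return chosen


def group_patches(points, center_idx, k):
    """Gather the k nearest neighbors of each center and express them relative to it."""
    centers = points[center_idx]                                        # (G, 3)
    d = np.sum((centers[:, None, :] - points[None, :, :]) ** 2, axis=-1)  # (G, N)
    knn = np.argsort(d, axis=1)[:, :k]                                  # (G, k)
    return points[knn] - centers[:, None, :]                            # (G, k, 3)


def tokenize(patches, weight):
    """Toy tokenizer: per-point linear map, then max-pool over each patch."""
    return (patches @ weight).max(axis=1)                               # (G, D)


def build_sequence(tokens, task_token):
    """Concatenate the task token with the patch tokens (order is an assumption)."""
    return np.concatenate([task_token[None, :], tokens], axis=0)        # (G + 1, D)


rng = np.random.default_rng(0)
cloud = rng.standard_normal((1024, 3))            # synthetic point cloud, N = 1024
center_idx = farthest_point_sample(cloud, 64)     # 64 patches (assumed)
patches = group_patches(cloud, center_idx, 32)    # 32 points per patch (assumed)
weight = rng.standard_normal((3, 768)) * 0.02     # stand-in for trainable tokenizer weights
task_token = np.zeros(768)                        # stand-in for a task token
sequence = build_sequence(tokenize(patches, weight), task_token)
print(sequence.shape)  # (65, 768)
```

In a full model, `sequence` would be passed through the CLIP transformer blocks with their weights held fixed, while the tokenizer and task token are the parts that would be updated during training.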

Getting Started

This repository implements our method on four tasks: classification, detection, indoor segmentation, and outdoor segmentation. To run a task, refer to the README.md in the corresponding directory (classification, detection, indoor_segmentation, or outdoor_segmentation).

Citing our work

Please cite the following paper if you use our code:

@article{huangepcl,
  title={EPCL: Frozen CLIP Transformer is An Efficient Point Cloud Encoder},
  author={Xiaoshui Huang and Zhou Huang and Sheng Li and Wentao Qu and Tong He and Yuenan Hou and Yifan Zuo and Wanli Ouyang},
  journal={AAAI},
  year={2024}
}

Acknowledgement

The frozen transformer encoder used in all tasks comes from CLIP.

Part of our implementation uses code from repositories below:

We thank the authors for their great work!
