Yongkang Li1,*, Tianheng Cheng1,*, Wenyu Liu1, Xinggang Wang1,📧
1 Huazhong University of Science and Technology
(* equal contribution, 📧 corresponding author)
- Mask-Adapter is a simple yet effective method that can be seamlessly integrated into open-vocabulary segmentation methods, e.g., FC-CLIP and MAFT-Plus, to tackle their existing bottlenecks.
- Mask-Adapter effectively extends to SAM without training, achieving impressive results across multiple open-vocabulary segmentation benchmarks.
- Release code
- Release weights
- Release demo with SAM-2 👉 🤗 Mask-Adapter
- Release weights trained with additional data
Model | Backbone | A-847 | A-150 | PC-459 | PC-59 | PAS-20 | Download |
---|---|---|---|---|---|---|---|
FC-CLIP | ConvNeXt-L | 14.8 | 34.1 | 18.2 | 58.4 | 95.4 | model |
FC-CLIP + Mask-Adapter | ConvNeXt-L | 14.1 | 36.6 | 19.3 | 59.7 | 95.5 | model |
MAFTP-Base | ConvNeXt-B | 13.8 | 34.5 | 18.5 | 57.5 | 95.5 | model |
MAFTP-Base + Mask-Adapter | ConvNeXt-B | 14.2 | 35.6 | 17.9 | 58.4 | 95.1 | model |
MAFTP-Large | ConvNeXt-L | 15.5 | 36.3 | 21.2 | 59.5 | 96.4 | model |
MAFTP-Large + Mask-Adapter | ConvNeXt-L | 16.2 | 38.2 | 22.7 | 60.4 | 95.8 | model |
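
To make the effect of adding Mask-Adapter easier to read off the table, the per-benchmark differences can be computed directly. The snippet below is only an illustrative sketch: the scores are copied from the table above, while the names `BENCHMARKS`, `RESULTS`, and `adapter_deltas` are hypothetical helpers and not part of this repository.

```python
from typing import Dict, List

# Column order follows the results table header.
BENCHMARKS: List[str] = ["A-847", "A-150", "PC-459", "PC-59", "PAS-20"]

# Scores copied verbatim from the results table above.
RESULTS: Dict[str, List[float]] = {
    "FC-CLIP": [14.8, 34.1, 18.2, 58.4, 95.4],
    "FC-CLIP + Mask-Adapter": [14.1, 36.6, 19.3, 59.7, 95.5],
    "MAFTP-Base": [13.8, 34.5, 18.5, 57.5, 95.5],
    "MAFTP-Base + Mask-Adapter": [14.2, 35.6, 17.9, 58.4, 95.1],
    "MAFTP-Large": [15.5, 36.3, 21.2, 59.5, 96.4],
    "MAFTP-Large + Mask-Adapter": [16.2, 38.2, 22.7, 60.4, 95.8],
}


def adapter_deltas(baseline: str) -> Dict[str, float]:
    """Return (baseline + Mask-Adapter) minus baseline for each benchmark."""
    base = RESULTS[baseline]
    adapted = RESULTS[f"{baseline} + Mask-Adapter"]
    # Round to one decimal to match the precision reported in the table.
    return {
        name: round(after - before, 1)
        for name, before, after in zip(BENCHMARKS, base, adapted)
    }


if __name__ == "__main__":
    for baseline in ("FC-CLIP", "MAFTP-Base", "MAFTP-Large"):
        gains = adapter_deltas(baseline)
        summary = ", ".join(f"{name}: {delta:+.1f}" for name, delta in gains.items())
        print(f"{baseline}: {summary}")
```

Positive values indicate benchmarks where the Mask-Adapter variant scores higher than its baseline; negative values indicate where it scores lower.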
If you find Mask-Adapter useful in your research or applications, please consider giving us a star 🌟 and citing it with the following BibTeX entry.
@article{li2024maskadapter,
title={Mask-Adapter: The Devil is in the Masks for Open-Vocabulary Segmentation},
author={Yongkang Li and Tianheng Cheng and Wenyu Liu and Xinggang Wang},
year={2024},
eprint={2412.04533},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2412.04533},
}
All code in this repository is released under the Apache License 2.0.
Mask-Adapter is based on the following projects: detectron2, Mask2Former, FC-CLIP and MAFTP. Many thanks for their excellent contributions to the community.