Official repository for "Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective" (CVPR 2023).
We aim to advance blind image quality assessment (BIQA), which predicts the human perception of image quality without any reference information. We develop a general and automated multitask learning scheme for BIQA that exploits auxiliary knowledge from other tasks, with model parameter sharing and loss weighting determined automatically. Specifically, we first describe all candidate label combinations (from multiple tasks) using a textual template, and compute the joint probability from the cosine similarities of the visual-textual embeddings. Predictions for each task can then be inferred from the joint distribution and optimized with carefully designed loss functions. Through comprehensive experiments on learning three tasks (BIQA, scene classification, and distortion type identification), we verify that the proposed BIQA method 1) benefits from the scene classification and distortion type identification tasks and outperforms the state of the art on multiple IQA datasets, 2) is more robust in the group maximum differentiation competition, and 3) realigns the quality annotations from different IQA datasets more effectively.
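The joint-probability step described above can be illustrated with a minimal sketch. It is not the code in this repository: the function names, the temperature value, the five quality levels with scores 1 to 5, and the random stand-in embeddings are all assumptions made for illustration. In practice the embeddings would come from CLIP's image and text encoders, with one text prompt per (scene, distortion, quality) combination.

```python
import math
import random
from itertools import product


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def joint_probability(image_emb, text_embs, temperature=0.01):
    """Softmax over temperature-scaled cosine similarities.

    text_embs maps a label tuple (scene, distortion, quality) to the
    embedding of its textual description. The temperature is an assumed value.
    """
    logits = {k: cosine(image_emb, e) / temperature for k, e in text_embs.items()}
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {k: math.exp(v - peak) for k, v in logits.items()}
    total = sum(exps.values())
    return {k: v / total for k, v in exps.items()}


def marginal(joint, axis):
    """Sum the joint distribution over every task except the one at `axis`."""
    out = {}
    for labels, p in joint.items():
        out[labels[axis]] = out.get(labels[axis], 0.0) + p
    return out


# Hypothetical label sets; the real ones are defined by the training data.
scenes = ["animal", "cityscape"]
distortions = ["blur", "noise"]
levels = {"bad": 1, "poor": 2, "fair": 3, "good": 4, "perfect": 5}

# Random vectors stand in for CLIP embeddings (assumption, for a runnable demo).
rng = random.Random(0)
dim = 8
text_embs = {
    combo: [rng.gauss(0, 1) for _ in range(dim)]
    for combo in product(scenes, distortions, levels)
}
image_emb = [rng.gauss(0, 1) for _ in range(dim)]

joint = joint_probability(image_emb, text_embs)
quality_probs = marginal(joint, axis=2)  # marginalize out scene and distortion
score = sum(levels[name] * p for name, p in quality_probs.items())
print(f"predicted quality score: {score:.3f}")
```

Marginalizing along axis 0 or 1 instead would give the scene or distortion prediction from the same joint distribution, which is how a single set of similarities can serve all three tasks.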
torch 1.8+
torchvision
Python 3
pip install ftfy regex tqdm
pip install git+https://github.com/openai/CLIP.git
python train_unique_clip_weight.py
python BIQA_benchmark.py
python demo.py
python demo2.py
Google Drive:
https://drive.google.com/file/d/1GoKwUKNR-rvX11QbKRN8MuBZw2hXKHGh/view?usp=sharing
Baidu Netdisk:
Link: https://pan.baidu.com/s/1KHjj7T8y2H_eKE6w7HnWJA Extraction code: 2b8v
BIQA Model | AGIQA-3K | AGIQA-1K | SJTU-H3D | AIGCIQA2023 | Venue |
---|---|---|---|---|---|
DBCNN | 0.6454 | 0.5133 | 0.4560 | 0.7301 | TCSVT2020 |
HyperIQA | 0.6291 | 0.5253 | 0.2696 | 0.7211 | CVPR2020 |
TReS | 0.6460 | 0.5101 | 0.2700 | 0.7410 | WACV2022 |
UNIQUE | 0.6659 | 0.4596 | 0.7523 | 0.7605 | TIP2021 |
MUSIQ | 0.6294 | 0.5254 | 0.5313 | 0.7358 | ICCV2021 |
PaQ-2-PiQ | 0.5023 | 0.5378 | 0.2683 | 0.6425 | CVPR2020 |
CLIPIQA | 0.6580 | 0.3411 | -0.0793 | 0.6088 | AAAI2023 |
CLIPIQA+ | 0.6831 | 0.4461 | 0.5567 | 0.7158 | AAAI2023 |
MANIQA | 0.6950 | 0.6180 | 0.4523 | 0.7282 | CVPRW2022 |
LIQE (Ours) | 0.7212 | 0.5785 | 0.6716 | 0.7435 | CVPR2023 |
@inproceedings{zhang2023liqe,
title={Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective},
author={Zhang, Weixia and Zhai, Guangtao and Wei, Ying and Yang, Xiaokang and Ma, Kede},
booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
pages={14071--14081},
year={2023}
}