Official implementation of the ICCV 2023 paper: Enhancing Generalization of Universal Adversarial Perturbation through Gradient Aggregation
Deep neural networks are vulnerable to universal adversarial perturbation (UAP), an instance-agnostic perturbation capable of fooling the target model for most samples. Compared to instance-specific adversarial examples, UAP is more challenging as it needs to generalize across various samples and models. In this paper, we examine the serious dilemma of UAP generation methods from a generalization perspective: the gradient vanishing problem when using small-batch stochastic gradient optimization and the local optima problem when using large-batch optimization. To address these problems, we propose a simple and effective method called Stochastic Gradient Aggregation (SGA), which alleviates gradient vanishing and escapes from poor local optima at the same time. Specifically, SGA employs small-batch training to perform multiple iterations of inner pre-search. Then, all the inner gradients are aggregated into a one-step gradient estimate to enhance gradient stability and reduce quantization errors. Extensive experiments on the standard ImageNet dataset demonstrate that our method significantly enhances the generalization ability of UAP and outperforms other state-of-the-art methods.
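The inner pre-search and aggregation steps described above can be sketched as follows. This is a minimal, framework-free illustration rather than the code in this repository: the names `sga_step`, `grad_fn`, and `toy_grad`, the sign-based updates, the step size `alpha`, and the L-infinity bound `eps` are assumptions made for the example, and `grad_fn` stands in for the gradient of the attack loss with respect to the perturbation on one small batch.

```python
def sign(v):
    """Return -1.0, 0.0, or 1.0 according to the sign of v."""
    return float((v > 0) - (v < 0))


def clip(delta, eps):
    """Project every coordinate of delta back into [-eps, eps]."""
    return [max(-eps, min(eps, d)) for d in delta]


def sga_step(delta, small_batches, grad_fn, alpha, eps):
    """One outer update in the spirit of SGA (illustrative sketch).

    delta: current universal perturbation (list of floats).
    small_batches: small batches drawn from one large batch.
    grad_fn(batch, delta): hypothetical callback returning d(loss)/d(delta).
    """
    inner = list(delta)              # pre-search starts from the current UAP
    aggregated = [0.0] * len(delta)  # running sum of the inner gradients
    for batch in small_batches:
        g = grad_fn(batch, inner)
        aggregated = [a + gi for a, gi in zip(aggregated, g)]
        # Inner pre-search: move the temporary perturbation, keep it in the eps-ball.
        inner = clip([d + alpha * sign(gi) for d, gi in zip(inner, g)], eps)
    # Outer update: a single step on the original delta using the aggregated gradient.
    return clip([d + alpha * sign(a) for d, a in zip(delta, aggregated)], eps)


def toy_grad(batch, delta):
    """Gradient of the toy loss mean(x . delta) over the batch: the mean sample."""
    n = len(batch)
    return [sum(x[i] for x in batch) / n for i in range(len(delta))]


# Three small batches of one 2-D sample each; the second coordinate has noisy signs.
batches = [[[1.0, -2.0]], [[0.5, 3.0]], [[1.5, -0.5]]]
new_delta = sga_step([0.0, 0.0], batches, toy_grad, alpha=0.1, eps=0.5)
```

In this toy run the second coordinate receives gradients of mixed sign across the small batches, and the outer step follows the sign of their sum rather than the sign of any single small-batch gradient.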
Torch
We train SGA with PyTorch 1.10.0 and torchvision 0.11.0.
Dataset
Refer to the instructions here for downloading and preparing the ImageNet dataset, then put the image folders under the imagenet folder:
- imagenet/
    train/n01440764/
        n01440764_10026.JPEG
        n01440764_10027.JPEG
        ...
    val/n01440764/
        ILSVRC2012_val_00000293.JPEG
        ...
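As a quick sanity check before training, the layout above can be verified with a short script. This is a hedged sketch and not part of the repository: the helper name `check_imagenet_layout` and the default root path `imagenet` are assumptions, and the check only confirms that `train/` and `val/` each contain class subfolders holding `.JPEG` files.

```python
from pathlib import Path


def check_imagenet_layout(root="imagenet"):
    """Return {split: number of class folders}; raise if the layout looks wrong."""
    root = Path(root)
    counts = {}
    for split in ("train", "val"):
        split_dir = root / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing folder: {split_dir}")
        # Each class (e.g. n01440764) is a subfolder holding .JPEG images.
        class_dirs = [
            d for d in split_dir.iterdir()
            if d.is_dir() and any(d.glob("*.JPEG"))
        ]
        if not class_dirs:
            raise ValueError(f"no class folders with .JPEG files in {split_dir}")
        counts[split] = len(class_dirs)
    return counts
```

For a complete copy of the 1000-class ImageNet dataset, both counts would be expected to be 1000.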
Pretrained backbone model
Pre-trained ImageNet models are available online via torchvision.
UAP generation
To generate a UAP on the ImageNet dataset by applying SGA with the cross-entropy loss and the surrogate model "VGG16", run the commands below. To use another surrogate model, change "--model_name". For other parameters, please refer to the code.
cd src
sh train.sh
UAP evaluation
After generating the UAP, run the evaluation step to obtain the fooling ratio. For the ImageNet dataset, evaluation against five target models is run as follows.
cd src
sh gen_sga_eval.sh
Fooling ratio
In the white-box setting, UAPs generated by SGA achieve an average fooling ratio of over 95% on the ImageNet test set.
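For reference, the fooling ratio is commonly defined as the fraction of samples whose predicted label changes once the UAP is added. The sketch below illustrates that definition on plain label lists; the function name `fooling_ratio` and its inputs are assumptions for illustration, not the interface of the evaluation script.

```python
def fooling_ratio(clean_preds, adv_preds):
    """Fraction of samples whose predicted label changes under the perturbation."""
    if len(clean_preds) != len(adv_preds):
        raise ValueError("prediction lists must have the same length")
    if not clean_preds:
        raise ValueError("at least one prediction is required")
    # A sample counts as fooled when its label differs with the UAP applied.
    fooled = sum(c != a for c, a in zip(clean_preds, adv_preds))
    return fooled / len(clean_preds)


# Labels predicted for four images without and with the perturbation.
ratio = fooling_ratio([3, 7, 7, 1], [5, 7, 2, 9])  # 3 of 4 labels change
```

Note that this definition compares predictions with and without the perturbation, so it does not require ground-truth labels.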
Citation
If you find our code useful, please consider citing our paper:
@article{liu2023enhancing,
title={Enhancing Generalization of Universal Adversarial Perturbation through Gradient Aggregation},
author={Liu, Xuannan and Zhong, Yaoyao and Zhang, Yuhang and Qin, Lixiong and Deng, Weihong},
journal={arXiv preprint arXiv:2308.06015},
year={2023}
}
This project is built on the open-source repository sgd-uap-torch. Thanks to the team for their impressive work!