This repository is the implementation of "Building Reliable Explanations of Unreliable Neural Networks: Locally Smoothing Perspective of Model Interpretation", CVPR 2021.
arXiv link: https://arxiv.org/abs/2103.14332
- Dohun Lim
- Hyeonseok Lee
- Sungchan Kim
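- Adversarial method

| method | Untargeted, PGD | Targeted, Structured | Targeted, Unstructured |
| --- | --- | --- | --- |
| state | ✅ | ✅ | ✅ |
| dependency | CleverHans | N/A | N/A |

- Saliency method

| method | Ours, RelEx | Real Time Saliency | GradCAM | DeepLIFT | SmoothGrad | Integrated Gradient | Simple Gradient |
| --- | --- | --- | --- | --- | --- | --- | --- |
| state | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| dependency | N/A | N/A | N/A | Captum | N/A | N/A | N/A |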
This implementation depends on the following.
- Python == 3.6
- PyTorch >= 1.7
- CleverHans == 4.0.0
pip install git+https://github.com/tensorflow/cleverhans.git#egg=cleverhans
- Captum >= 0.3.1
pip install captum
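To confirm that the packages listed above are importable before running the scripts, a small check like the following may help. It is a minimal sketch, not part of this repository: the function name `check_environment` is hypothetical, and it assumes the import names `torch`, `cleverhans`, and `captum`.

```python
import importlib

# Import name -> display name for the dependencies listed above (assumed import names).
REQUIRED = {"torch": "PyTorch", "cleverhans": "CleverHans", "captum": "Captum"}


def check_environment(required=REQUIRED):
    """Map each display name to its installed version string, or None if missing."""
    report = {}
    for module_name, display_name in required.items():
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            report[display_name] = None
        else:
            # Not every package exposes __version__, so fall back to "unknown".
            report[display_name] = getattr(module, "__version__", "unknown")
    return report
```

Calling `check_environment()` returns a dictionary in which a value of `None` marks a package that still needs to be installed.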
Generate adversarial examples and saliency maps with the following command.
python generate.py [(options)]
Evaluate the saliency maps with the following command.
python eval.py [(options)]
- --robust: Flag that selects whether the naturally trained or the adversarially trained model is used
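For readers unfamiliar with boolean command-line flags, here is a minimal sketch of how such an option is typically declared with `argparse`. The function `build_parser` is hypothetical and is not taken from `generate.py` or `eval.py`. It also assumes that passing `--robust` selects the adversarially trained model and omitting it selects the naturally trained one.

```python
import argparse


def build_parser():
    """Hypothetical parser illustrating a boolean --robust flag."""
    parser = argparse.ArgumentParser(description="Illustrative option parser")
    parser.add_argument(
        "--robust",
        action="store_true",  # False unless the flag is given on the command line
        help="assumed meaning: use the adversarially trained model",
    )
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--robust"])
print(args.robust)
```

With `action="store_true"` the flag takes no value, so it would be passed as `--robust` rather than `--robust True`.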
import os

import torch

from models.saliency import RelEx
from models.network import load_network
from utils import load_image

# Locate the input image under <workspace>/data
workspace_dir = os.path.dirname(__file__)
data_root_dir = os.path.join(workspace_dir, 'data')
x_name = 'ILSVRC2012_val_00023552'
x_full_dir = os.path.join(data_root_dir, x_name + '.JPEG')

# Use the GPU when one is available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Load the image on the same device as the network
x = load_image(x_full_dir, gpu=(device.type == 'cuda'))[0]
net = load_network('resnet50', encoder=False, robust=False).to(device)

# Explain the class the network predicts for x
target_cls = net(x).max(1)[1]
relex = RelEx(net, device=device)
sal, accu = relex(x, target_cls)
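A saliency map is usually rescaled to [0, 1] before it is displayed as a heat map. The sketch below shows one common way to do this, min-max normalization, in plain Python. It assumes the map has been converted to a 2D nested list (for example with `sal.squeeze().tolist()`, which in turn assumes `sal` is a tensor holding a single map). The helper `min_max_normalize` is hypothetical and not part of this repository.

```python
def min_max_normalize(saliency_rows, eps=1e-8):
    """Rescale a 2D map (list of rows of floats) to the range [0, 1]."""
    flat = [value for row in saliency_rows for value in row]
    lowest, highest = min(flat), max(flat)
    # Guard against division by zero when the map is constant.
    scale = max(highest - lowest, eps)
    return [[(value - lowest) / scale for value in row] for row in saliency_rows]


example = [[0.0, 2.0], [4.0, 8.0]]
normalized = min_max_normalize(example)
print(normalized)  # [[0.0, 0.25], [0.5, 1.0]]
```

A constant map has no range to rescale, so the `eps` guard maps every entry to 0 instead of raising an error.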
- Adversarial
- Untargeted, PGD: Towards Deep Learning Models Resistant to Adversarial Attacks
- Targeted, Structured: Explanations can be manipulated and geometry is to blame
- Targeted, Unstructured: Interpretation of Neural Networks is Fragile
- Saliency
- Real Time Saliency: Real Time Image Saliency for Black Box Classifiers
- GradCAM: Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
- DeepLIFT: Learning Important Features Through Propagating Activation Differences
- SmoothGrad: SmoothGrad: removing noise by adding noise
- Integrated Gradient: Axiomatic Attribution for Deep Networks
- Simple Gradient: Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps
- Network
- Robust ResNet-50: Robustness May Be at Odds with Accuracy