Pytorch implementation of convolutional neural network visualization techniques

mariopi27/pytorch-cnn-visualizations

Convolutional Neural Network Visualizations

This repo contains the following CNN operations implemented in PyTorch:

  • Gradient visualization with vanilla backpropagation
  • Gradient visualization with guided backpropagation [1]
  • Gradient visualization with saliency maps [4]
  • Gradient-weighted [3] class activation mapping [2]
  • Guided gradient-weighted class activation mapping [3]
  • CNN filter visualization [9]
  • Deep Dream [10]
  • Class specific image generation (A generated image that maximizes a certain class) [4]
  • Fooling images (Unrecognizable images predicted as classes with high confidence) [7]
  • Fooling images disguised as another image (Picture of an iPod being predicted as a horse) [7]

The following operations are planned for the near future:

  • Inverted Image Representations [5]
  • Weakly supervised object segmentation [4]
  • Semantic Segmentation with Deconvolutions [6]
  • Smooth Grad [8]

The code uses the pretrained VGG19, VGG16 and AlexNet models from the model zoo. Some of the code assumes that the model's layers are split into two sections: features, which contains the convolutional layers, and classifier, which contains the fully connected layers (applied after flattening the convolutional output). To port this code to a model without that separation, edit the parts that call model.features and model.classifier.
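As a rough illustration of that split, the sketch below runs an input layer by layer through model.features, keeps one intermediate output, and then passes the flattened result to model.classifier. TinyNet and forward_with_activations are hypothetical names used only for this example; the toy model is randomly initialized and merely mimics the features/classifier layout.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Hypothetical toy model that only mimics the features/classifier split."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(nn.Linear(8 * 16 * 16, num_classes))

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)  # flatten the convolutional output
        return self.classifier(x)

def forward_with_activations(model, x, target_layer):
    """Walk through model.features, keep one layer's output, then classify."""
    conv_output = None
    for index, layer in enumerate(model.features):
        x = layer(x)
        if index == target_layer:
            conv_output = x
    logits = model.classifier(torch.flatten(x, 1))
    return conv_output, logits

torch.manual_seed(0)
model = TinyNet().eval()
image = torch.rand(1, 3, 32, 32)  # random stand-in for a pre-processed image
conv_output, logits = forward_with_activations(model, image, target_layer=0)
```

A model without this layout would need its own version of the loop, which is the editing the paragraph above refers to.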

All images are pre-processed with the mean and std of the ImageNet dataset before being fed to the model. None of the code uses the GPU, as these operations are quite fast for a single image, but you can enable it with very little effort. In the examples below, the number in brackets after a description, as in Mastiff (243), is the class id in the ImageNet dataset.
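A minimal sketch of that pre-processing, assuming an RGB image stored as an (H, W, 3) uint8 tensor and the per-channel ImageNet statistics commonly used with torchvision models. The function names preprocess and recreate are hypothetical, and the random image is a placeholder for a real photo. The last lines show one way to move the batch to a GPU when one is available.

```python
import torch

# Commonly used ImageNet channel statistics (RGB order, inputs scaled to 0..1).
IMAGENET_MEAN = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
IMAGENET_STD = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)

def preprocess(image_uint8):
    """Turn an (H, W, 3) uint8 RGB image into a normalized (1, 3, H, W) batch."""
    x = image_uint8.permute(2, 0, 1).unsqueeze(0).float() / 255.0
    return (x - IMAGENET_MEAN) / IMAGENET_STD

def recreate(batch):
    """Undo the normalization so an optimized image can be saved or displayed."""
    x = batch * IMAGENET_STD + IMAGENET_MEAN
    x = (x.clamp(0, 1) * 255).round().to(torch.uint8)
    return x.squeeze(0).permute(1, 2, 0)

# Placeholder image; a real one would come from an image loader.
image = torch.randint(0, 256, (224, 224, 3), dtype=torch.uint8)

# Assumption: the model would be moved to the same device with model.to(device).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
batch = preprocess(image).to(device)
```

Note that OpenCV loads images in BGR order, so an image read that way would need its channels reversed before this step.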

I tried to comment the code as much as possible. If you have any issues understanding or porting it, don't hesitate to reach out.

Below are some sample results for each operation.

Gradient Visualization and Segmentation

Target classes: King Snake (56), Mastiff (243), Spider (72)

  • Original Image
  • Colored Vanilla Backpropagation
  • Vanilla Backpropagation Saliency
  • Colored Guided Backpropagation (GB)
  • Guided Backpropagation Saliency (GB)
  • Guided Backpropagation Negative Saliency (GB)
  • Guided Backpropagation Positive Saliency (GB)
  • Gradient-weighted Class Activation Map (Grad-CAM)
  • Gradient-weighted Class Activation Heatmap (Grad-CAM)
  • Gradient-weighted Class Activation Heatmap on Image (Grad-CAM)
  • Colored Guided Gradient-weighted Class Activation Map (Guided-Grad-CAM)
  • Guided Gradient-weighted Class Activation Map Saliency (Guided-Grad-CAM)
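Two of the techniques listed above can be sketched compactly. The code below computes vanilla backpropagation gradients with a saliency map (maximum absolute gradient over the color channels, as in [4]) and a Grad-CAM heatmap (channel weights from spatially averaged gradients, as in [3]). It is a simplified sketch, not the repository's code: TinyNet is a hypothetical, randomly initialized toy model, and Grad-CAM is taken at the output of model.features rather than at a chosen layer of a pretrained network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyNet(nn.Module):
    """Hypothetical toy model with a features/classifier split."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.classifier = nn.Linear(8 * 16 * 16, num_classes)

    def forward(self, x):
        return self.classifier(torch.flatten(self.features(x), 1))

def vanilla_saliency(model, image, target_class):
    """Gradient of the target class score with respect to the input pixels."""
    image = image.detach().clone().requires_grad_(True)
    model.zero_grad()
    model(image)[0, target_class].backward()
    gradients = image.grad[0]                          # (3, H, W)
    saliency = gradients.abs().max(dim=0).values       # (H, W)
    return gradients, saliency / (saliency.max() + 1e-12)

def grad_cam(model, image, target_class):
    """Weight each feature map by its mean gradient, sum them, apply ReLU."""
    activations = model.features(image)                # (1, K, h, w)
    activations.retain_grad()                          # keep grad of a non-leaf tensor
    logits = model.classifier(torch.flatten(activations, 1))
    model.zero_grad()
    logits[0, target_class].backward()
    weights = activations.grad.mean(dim=(2, 3), keepdim=True)
    cam = F.relu((weights * activations).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[2:], mode="bilinear",
                        align_corners=False)           # upsample to input size
    cam = cam[0, 0].detach()
    return cam / (cam.max() + 1e-12)

torch.manual_seed(0)
model = TinyNet().eval()
image = torch.rand(1, 3, 32, 32)  # stand-in for a pre-processed image
gradients, saliency = vanilla_saliency(model, image, target_class=3)
cam = grad_cam(model, image, target_class=3)
```

Both maps are scaled to the range 0 to 1 so they can be rendered as grayscale images or heatmaps.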

Convolutional Neural Network Filter Visualization

CNN filters can be visualized by optimizing the input image with respect to the output of a specific convolution operation. For this example I used a pre-trained VGG16. Visualizations start with basic color and direction filters at the lower layers, and the complexity of the filters increases towards the final layer. Techniques like blurring and gradient clipping will probably produce better images.

  • Layer 2 (Conv 1-2)
  • Layer 10 (Conv 2-1)
  • Layer 17 (Conv 3-1)
  • Layer 24 (Conv 4-1)
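The optimization described in this section can be sketched as gradient ascent on the input image. The example below is an illustration under stated assumptions, not the repository's implementation: the small features stack is a hypothetical, randomly initialized stand-in for VGG16, the objective is the mean output of one filter, and Adam with these step counts and learning rate is an arbitrary choice.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Hypothetical toy stack standing in for the features section of a pretrained model.
features = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 16, kernel_size=3, padding=1),
    nn.ReLU(),
).eval()
for parameter in features.parameters():
    parameter.requires_grad_(False)  # only the image is optimized

def filter_activation(features, image, layer_index, filter_index):
    """Mean output of one filter at the selected layer."""
    x = image
    for index, layer in enumerate(features):
        x = layer(x)
        if index == layer_index:
            break
    return x[0, filter_index].mean()

def visualize_filter(features, layer_index, filter_index,
                     steps=30, lr=0.1, start=None):
    """Update the image so that the selected filter responds more strongly."""
    image = torch.rand(1, 3, 32, 32) if start is None else start.detach().clone()
    image.requires_grad_(True)
    optimizer = torch.optim.Adam([image], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        # Minimizing the negative activation maximizes the activation.
        loss = -filter_activation(features, image, layer_index, filter_index)
        loss.backward()
        optimizer.step()
    return image.detach()

pattern = visualize_filter(features, layer_index=2, filter_index=5)
```

Passing an existing picture through the hypothetical start argument, instead of starting from random noise, gives the Deep Dream variant described in the next section.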

Deep Dream

Deep Dream is technically the same operation as layer visualization; the only difference is that you start from another picture instead of a random image. The samples below were created with VGG19. The result depends entirely on the chosen filter, so it is kind of hit or miss. More complex models produce more high-level features, meaning that if you replace VGG19 with an Inception variant you will get more noticeable shapes when you target higher conv layers. As with layer visualization, additional techniques like gradient clipping and blurring might give better visualizations.

  • Original Image
  • VGG19, Layer: 34 (Final Conv. Layer), Filter: 94
  • VGG19, Layer: 34 (Final Conv. Layer), Filter: 103

Class Specific Image Generation

This operation produces different outputs depending on the model and the applied regularization method. Below are some samples produced with L2 regularization from VGG19. Note that these images are generated with regular CNNs by optimizing the input (rather than the model weights), not with GANs.

Target class: Worm Snake (52) - (VGG19) Target class: Spider (72) - (VGG19)

The samples below show the images produced with no regularization, L1 regularization and L2 regularization for the target class flamingo (130), to show the differences between regularization methods. These images are generated with a pretrained AlexNet.

No Regularization L1 Regularization L2 Regularization
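The three settings compared above can be sketched as one optimization loop with a switchable penalty on the image. This is a minimal sketch under assumptions: the model is a hypothetical, randomly initialized toy classifier rather than AlexNet, the objective is the raw class score minus the penalty (in the spirit of [4]), and the step count, learning rate and penalty weight are arbitrary.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Hypothetical toy classifier; a pretrained network would be used in practice.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(4),
    nn.Flatten(),
    nn.Linear(8 * 4 * 4, 10),
).eval()

def penalty(image, regularizer, weight):
    """Regularization term added to the loss: none, L1 or L2 on the pixels."""
    if regularizer == "l1":
        return weight * image.abs().sum()
    if regularizer == "l2":
        return weight * image.pow(2).sum()
    return image.new_zeros(())

def generate_class_image(model, target_class, regularizer="l2",
                         weight=1e-3, steps=50, lr=0.5, seed=0):
    """Optimize the input (not the weights) to raise the target class score."""
    generator = torch.Generator().manual_seed(seed)  # same start for every setting
    image = torch.rand(1, 3, 32, 32, generator=generator).requires_grad_(True)
    optimizer = torch.optim.SGD([image], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        score = model(image)[0, target_class]
        loss = -score + penalty(image, regularizer, weight)
        loss.backward()
        optimizer.step()
    return image.detach()

samples = {name: generate_class_image(model, target_class=5, regularizer=name)
           for name in ("none", "l1", "l2")}
```

Because every run starts from the same seeded image, any difference between the three results comes from the regularizer alone.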

Produced samples can be further optimized to resemble the desired target class. Some operations you can incorporate to improve quality are: blurring, clipping gradients that are below a certain threshold, random color swaps on some parts, randomly cropping the image, and forcing the generated image to follow a path to force continuity.

Fooling Image Generation

This operation is quite similar to generating class specific images: we start with a random image, continuously update it with targeted backpropagation (for a certain class), and stop when we reach the target confidence for that class. All of the images below were generated with a pretrained AlexNet to fool it.

  • Predicted as Zebra (340), confidence: 0.94
  • Predicted as Bow tie (457), confidence: 0.95
  • Predicted as Castle (483), confidence: 0.99
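The stopping rule described in this section can be sketched as follows. Several parts are assumptions made for illustration: the model is a hypothetical, randomly initialized toy classifier rather than AlexNet, the targeted update uses a cross-entropy loss against the target class, and the function names, learning rate and step limit are arbitrary. With a toy model the target confidence may not be reached before max_steps.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
# Hypothetical toy classifier; a pretrained network would be used in practice.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(4),
    nn.Flatten(),
    nn.Linear(8 * 4 * 4, 10),
).eval()

def class_confidence(model, image, target_class):
    """Softmax probability the model assigns to the target class."""
    with torch.no_grad():
        return F.softmax(model(image), dim=1)[0, target_class].item()

def generate_fooling_image(model, target_class, target_confidence=0.9,
                           max_steps=200, lr=1.0, start=None):
    """Update the image until the target class reaches the wanted confidence."""
    image = torch.rand(1, 3, 32, 32) if start is None else start.detach().clone()
    image.requires_grad_(True)
    optimizer = torch.optim.SGD([image], lr=lr)
    target = torch.tensor([target_class])
    for _ in range(max_steps):
        if class_confidence(model, image, target_class) >= target_confidence:
            break  # stop as soon as the confidence threshold is met
        optimizer.zero_grad()
        loss = F.cross_entropy(model(image), target)
        loss.backward()
        optimizer.step()
    image = image.detach()
    return image, class_confidence(model, image, target_class)

fooling_image, confidence = generate_fooling_image(model, target_class=7)
```

Passing a real picture as start and lowering lr keeps the result close to that picture, which is the idea behind the disguised variant in the next section.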

Disguised Fooling Image Generation

For this operation we start with an image and perform gradient updates on it for a specific class, but with smaller learning rates so that the original image does not change too much. As the samples show, on some images it is almost impossible to tell the two images apart, but on others it can clearly be observed that something is wrong. All of the examples below were created from and tested on AlexNet to fool it.

  • Predicted as Eel (390), confidence: 0.96
  • Predicted as Apple (948), confidence: 0.95
  • Predicted as Snowbird (13), confidence: 0.99
  • Predicted as Banjo (420), confidence: 0.99
  • Predicted as Abacus (457), confidence: 0.99
  • Predicted as Dumbbell (543), confidence: 1

Requirements:

torch >= 0.2.0.post4
torchvision >= 0.1.9
numpy >= 1.13.0
opencv >= 3.1.0

References:

[1] J. T. Springenberg, A. Dosovitskiy, T. Brox, and M. Riedmiller. Striving for Simplicity: The All Convolutional Net, https://arxiv.org/abs/1412.6806

[2] B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, A. Torralba. Learning Deep Features for Discriminative Localization, https://arxiv.org/abs/1512.04150

[3] R. R. Selvaraju, A. Das, R. Vedantam, M. Cogswell, D. Parikh, and D. Batra. Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization, https://arxiv.org/abs/1610.02391

[4] K. Simonyan, A. Vedaldi, A. Zisserman. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps, https://arxiv.org/abs/1312.6034

[5] A. Mahendran, A. Vedaldi. Understanding Deep Image Representations by Inverting Them, https://arxiv.org/abs/1412.0035

[6] H. Noh, S. Hong, B. Han. Learning Deconvolution Network for Semantic Segmentation, https://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Noh_Learning_Deconvolution_Network_ICCV_2015_paper.pdf

[7] A. Nguyen, J. Yosinski, J. Clune. Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images, https://arxiv.org/abs/1412.1897

[8] D. Smilkov, N. Thorat, N. Kim, F. Viégas, M. Wattenberg. SmoothGrad: removing noise by adding noise, https://arxiv.org/abs/1706.03825

[9] M. D. Zeiler, R. Fergus. Visualizing and Understanding Convolutional Networks, https://arxiv.org/abs/1311.2901

[10] A. Mordvintsev, C. Olah, M. Tyka. Inceptionism: Going Deeper into Neural Networks, https://research.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html
