Introduction and comparison to CAM, Grad-CAM, Guided back propagation and Guided Grad-CAM
Size: 7.56 MB
Language: en
Added: Sep 19, 2021
Slides: 24 pages
Introduction to Grad-CAM
Advisor: Henry Horng-Shing Lu
Student: Jane Hsing-Chuan Hsieh
Date: 2021-08-10
Research context
Concern: Model Transparency & Interpretability
- Despite unprecedented breakthroughs of CNNs in a variety of computer vision tasks, their lack of decomposability into individually intuitive components makes them hard to interpret.
Purpose: Visualizing CNNs
- Visualize CNN predictions by highlighting 'important' pixels (i.e., pixels whose intensity changes have the most impact on the prediction score).
Help Users Build Trust in AI
- We must build 'transparent' models that have the ability to explain why they predict what they predict.
Research context
What makes a good visual explanation?
- Class-discriminative: localizes the category in the image
  - Class Activation Mapping (CAM)
  - Gradient-weighted Class Activation Mapping (Grad-CAM)
- High-resolution: captures fine-grained detail (pixel-space gradient visualizations), but not class-discriminative
  - Guided Backpropagation
  - Deconvolution
- Both: Guided Grad-CAM
Outline
Brief Introduction to CNN Visualizing Tools
- CAM
- Grad-CAM
- Guided Backpropagation
- Guided Grad-CAM
1. Introduction
- CAM
- Grad-CAM
- Guided Backpropagation
- Guided Grad-CAM
Preface
- Convolutional layers of CNNs actually behave as object detectors (i.e., they localize objects), even though no supervision on the location of the object is provided.
- In other words, convolutional layers naturally retain spatial information.
- E.g., for action classification, a CNN is able to localize the discriminative regions as the objects that the humans are interacting with, rather than the humans themselves.
Preface
- However, this ability (spatial information / object detection) is lost in fully-connected layers.
- The higher the convolutional layer, the higher the level of semantics extracted.
- So we expect the last convolutional layer to offer the best combination of high-level semantics and detailed spatial information.
CAM
- For a particular category (c), a Class Activation Map (CAM) indicates the discriminative image regions used by the CNN to identify that category.
Characteristics
- Replaces fully-connected layers with a global average pooling (GAP) layer to minimize the number of parameters while maintaining high performance.
- GAP acts as a structural regularizer, preventing overfitting during training.
CAM
CNN Architecture
- For each feature map (f_k) at the last convolutional layer, GAP outputs its spatial average: F_k = Σ_{x,y} f_k(x, y)
- For a given class (c), the input for the output layer: S_c = Σ_k w_k^c F_k (w_k^c: importance of F_k for class c)
- Output score for class c: P_c = exp(S_c) / Σ_{c'} exp(S_{c'}) (e.g., softmax)
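As a minimal sketch of the GAP-then-output-layer computation above (plain Python, with hypothetical toy feature maps and weights, not the paper's code):

```python
import math

def gap(feature_map):
    """Global average pooling: spatial average F_k of one 2-D feature map."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)

def class_scores(feature_maps, weights):
    """S_c = sum_k w_k^c * F_k for each class c, then softmax probabilities."""
    F = [gap(fm) for fm in feature_maps]                         # F_k
    S = [sum(w_k * F_k for w_k, F_k in zip(w_c, F)) for w_c in weights]
    exp_S = [math.exp(s) for s in S]                             # softmax
    total = sum(exp_S)
    return S, [e / total for e in exp_S]

# Toy example: two 2x2 feature maps, two classes.
fmaps = [[[1.0, 3.0], [5.0, 7.0]],   # F_0 = 4.0
         [[0.0, 2.0], [4.0, 6.0]]]   # F_1 = 3.0
w = [[0.5, -0.5],                    # weights w_k^c for class 0
     [-0.5, 0.5]]                    # weights w_k^c for class 1
scores, probs = class_scores(fmaps, w)
print(scores)  # [0.5, -0.5]
```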
CAM
CAM Procedure
- The weights (w_1^c, w_2^c, …, w_n^c) of the output layer indicate the importance of the image regions (f_k) to a specific class (c).
- Compute CAM: M_c(x, y) = Σ_k w_k^c f_k(x, y)
- Note: if the shape (H, W) of the CAM (M_c) is different from that of the input images, up-sampling is needed to equalize the shapes.
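The weighted sum M_c above can be sketched in plain Python (toy feature maps and weights are illustrative assumptions):

```python
def cam(feature_maps, class_weights):
    """M_c(x, y) = sum_k w_k^c * f_k(x, y): class-weighted sum of feature maps."""
    H, W = len(feature_maps[0]), len(feature_maps[0][0])
    return [[sum(w * fm[i][j] for w, fm in zip(class_weights, feature_maps))
             for j in range(W)] for i in range(H)]

# Two 2x2 feature maps; weights w_k^c for one class c.
fmaps = [[[1.0, 3.0], [5.0, 7.0]],
         [[0.0, 2.0], [4.0, 6.0]]]
heatmap = cam(fmaps, [1.0, 0.5])
print(heatmap)  # [[1.0, 4.0], [7.0, 10.0]]
```

High values in the resulting map mark the image regions most important for class c; in practice the map is then up-sampled to the input resolution.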
CAM: Properties
- CAM trades off model complexity and performance (via global average pooling (GAP)) for more transparency.
Shortcoming
- To apply CAM, any CNN-based network must change its architecture: GAP is a must before the output layer.
- i.e., architectural changes, and hence re-training, are needed.
Grad-CAM
- Gradient-weighted Class Activation Mapping (Grad-CAM) generalizes CAM to a wide variety of CNN-based architectures, i.e., without requiring architectural changes or re-training.
Characteristics
- Without a GAP layer, we need another way to define the weights: Grad-CAM uses the gradients of any target concept (y^c) (e.g., 'dog' in a classification network) flowing into the final convolutional layer, and derives summary statistics from them to represent the weights (importance).
Source: Selvaraju, Ramprasaath R., et al. "Grad-CAM: Visual explanations from deep networks via gradient-based localization." Proceedings of the IEEE International Conference on Computer Vision. 2017.
Grad-CAM Procedure
- For a given class c, compute the gradient of its score y^c (before the softmax) w.r.t. the feature map activations A^k of a convolutional layer, i.e., ∂y^c/∂A^k
- Define the importance weight of feature map A^k via GAP: α_k^c = (1/Z) Σ_i Σ_j ∂y^c/∂A^k_{ij} (influence of A^k on y^c)
Grad-CAM Procedure
- Compute Grad-CAM: L^c_{Grad-CAM} = ReLU(Σ_k α_k^c A^k)
- ReLU is applied because we are only interested in the features (neurons) that have a positive influence on the class of interest, i.e., pixels whose intensity should be increased in order to increase y^c
- Note: if the shape (u, v) of L^c_{Grad-CAM} is different from that of the input images, up-sampling is needed to equalize the shapes.
Grad-CAM: Properties
- Grad-CAM generates visual explanations for a wide variety of CNN-based networks without requiring architectural changes or re-training.
Grad-CAM: Properties
- Grad-CAM can help identify biases in a dataset.
- Models trained on biased datasets may not generalize to real-world scenarios, or worse, may perpetuate biases and stereotypes (w.r.t. gender, race, age, etc.).
- E.g., for a "doctor" vs. "nurse" binary classification task:
  - A biased model had learned to look at the person's face and hairstyle to distinguish nurses from doctors, thus learning a gender stereotype.
  - An unbiased model made the right prediction by looking at the white coat and the stethoscope.
Grad-CAM: Properties
Shortcoming
- The localization map (heatmap) generated by Grad-CAM (and CAM) is coarse (low-resolution): it does not make sufficiently clear why the network predicts a particular instance (e.g., "tiger cat").
- Guided Backpropagation is another approach that provides a high-resolution map, i.e., fine-grained detail, or a pixel-space gradient visualization.
Guided Backpropagation: Another Approach
- Guided Backpropagation visualizes gradients of the network's prediction (i.e., the output neuron) w.r.t. the input image.
- This determines which pixels need to be changed the least to affect the prediction the most (i.e., pixels with higher absolute gradients).
- Negative gradients are suppressed through ReLU when backpropagating, because we are only interested in the pixels that increase the activation of the output neuron, rather than suppress it.
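The suppression rule above can be sketched for a single ReLU unit in plain Python (a simplified illustration on a flat list of values, not a full backprop implementation):

```python
def guided_relu_backward(forward_input, grad_output):
    """Guided backprop through one ReLU: a gradient passes only where the
    forward input was positive (the usual ReLU rule) AND the incoming
    gradient is positive (the extra 'guidance' signal from above)."""
    return [g if (x > 0 and g > 0) else 0.0
            for x, g in zip(forward_input, grad_output)]

# Toy values: forward-pass inputs to the ReLU, and gradients flowing back.
x = [1.0, -2.0, 3.0, 0.5]
g = [0.7, 0.9, -0.1, 0.2]
print(guided_relu_backward(x, g))  # [0.7, 0.0, 0.0, 0.2]
```

The second and third entries are zeroed for different reasons: the second because the unit was inactive in the forward pass, the third because the incoming gradient was negative.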
Guided Backpropagation: Properties
- Guided Backpropagation is high-resolution since it derives gradients directly w.r.t. the input image, instead of w.r.t. the last convolutional layer (as Grad-CAM does).
Shortcoming
- Not class-discriminative.
- Guided Grad-CAM combines Guided Backpropagation and Grad-CAM, and thus becomes class-discriminative.
Guided Grad-CAM
Characteristics
- Guided Grad-CAM is both high-resolution and class-discriminative.
Procedure
- Fuse Guided Backpropagation with Grad-CAM (point-wise multiplication of the two maps at input resolution) to create Guided Grad-CAM visualizations.
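The fusion step can be sketched in plain Python: up-sample the coarse Grad-CAM map to the input resolution, then multiply element-wise with the guided-backpropagation map. The maps below are hypothetical toy values, and nearest-neighbor up-sampling stands in for the interpolation used in practice:

```python
def upsample_nearest(mat, H, W):
    """Nearest-neighbor up-sampling of a small 2-D map to H x W."""
    h, w = len(mat), len(mat[0])
    return [[mat[i * h // H][j * w // W] for j in range(W)] for i in range(H)]

def guided_grad_cam(guided_bp, grad_cam_map):
    """Point-wise product of the up-sampled Grad-CAM map and the
    guided-backpropagation map, both at input resolution."""
    H, W = len(guided_bp), len(guided_bp[0])
    up = upsample_nearest(grad_cam_map, H, W)
    return [[guided_bp[i][j] * up[i][j] for j in range(W)] for i in range(H)]

gbp = [[0.2, 0.0, 0.5, 0.1],           # 4x4 guided-backprop map (toy)
       [0.0, 0.3, 0.0, 0.4],
       [0.6, 0.0, 0.2, 0.0],
       [0.0, 0.1, 0.0, 0.3]]
L = [[0.0, 1.0], [2.0, 0.0]]           # coarse 2x2 Grad-CAM map (toy)
print(guided_grad_cam(gbp, L))
```

The product keeps the fine detail of guided backprop only where Grad-CAM assigns class relevance, which is what makes the result both high-resolution and class-discriminative.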
Guided Grad-CAM: Properties
- Guided Grad-CAM also helps untrained users successfully discern a 'stronger' network from a 'weaker' one, even when both make identical predictions.
- (Figure: two models, A vs. B, with the same prediction accuracies; stronger network vs. weaker network.)
Thank you for your attention
Guided Backpropagation
Guided Backpropagation: Properties
- Guided backpropagation adds an additional guidance signal from the higher layers to usual backpropagation.
- This prevents the backward flow of negative gradients, which correspond to the neurons that decrease the activation of the higher-layer unit we aim to visualize.