Introduction to Convolutional Networks

ArunNegi37 · 94 views · 80 slides · May 24, 2024

About This Presentation

Convolutional Neural Networks


Slide Content

CNN

Technically, to train and test deep learning CNN models, each input image is passed through a series of convolution layers with filters (kernels), pooling layers, and fully connected (FC) layers, and a softmax function is applied to classify the object with probabilistic values between 0 and 1. The figure below shows the complete flow of a CNN that processes an input image and classifies the objects based on those values.

Convolution Layer

Convolution is the first layer used to extract features from an input image. It preserves the relationship between pixels by learning image features from small squares of input data. Mathematically, it is an operation that takes two inputs: an image matrix and a filter (kernel).

Consider a 5 × 5 image whose pixel values are 0 and 1, and a 3 × 3 filter matrix, as shown below. The convolution of the 5 × 5 image matrix with the 3 × 3 filter matrix produces an output called the "feature map", shown below.
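As a sketch of this operation (the pixel and filter values below are illustrative, not necessarily those in the figure):

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and
    sum the element-wise products at each position ('valid' mode)."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

# 5 x 5 binary image and 3 x 3 filter
image = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]])
kernel = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]])

feature_map = convolve2d_valid(image, kernel)
print(feature_map.shape)  # (3, 3)
```

A 5 × 5 input convolved with a 3 × 3 filter at stride 1 yields a 3 × 3 feature map, since the filter fits in 5 − 3 + 1 = 3 positions along each axis.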

Convolving an image with different filters can perform operations such as edge detection, blurring, and sharpening. The example below shows the resulting images after applying different types of filters (kernels).

Strides

Stride is the number of pixels by which the filter shifts over the input matrix. When the stride is 1, we move the filter one pixel at a time; when the stride is 2, we move the filter two pixels at a time, and so on. The figure below shows how convolution works with a stride of 2.
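A minimal sketch of a strided convolution, showing how the stride shrinks the output (the helper name and input values are illustrative):

```python
import numpy as np

def convolve2d_stride(image, kernel, stride=1):
    """'Valid' convolution with a configurable stride: the filter window
    shifts `stride` pixels between positions."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    oh = (ih - kh) // stride + 1
    ow = (iw - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            r, c = i * stride, j * stride
            out[i, j] = np.sum(image[r:r+kh, c:c+kw] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.ones((3, 3))
print(convolve2d_stride(image, kernel, stride=1).shape)  # (3, 3)
print(convolve2d_stride(image, kernel, stride=2).shape)  # (2, 2)
```

The general output size along each axis is (input − kernel) // stride + 1, so doubling the stride roughly halves the output dimensions.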

Padding

Sometimes the filter does not fit the input image perfectly. We have two options:
• Pad the image with zeros (zero-padding) so that the filter fits.
• Drop the part of the image where the filter does not fit. This is called valid padding, which keeps only the valid part of the image.
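A sketch of zero-padding for a "same"-size output, assuming an odd-sized kernel (the helper name is hypothetical):

```python
import numpy as np

def pad_same(image, kernel_size):
    """Zero-pad an image so that a 'valid' convolution with an odd-sized
    kernel produces an output the same size as the original input."""
    p = kernel_size // 2              # pad width on each side
    return np.pad(image, p, mode="constant", constant_values=0)

image = np.ones((5, 5))
padded = pad_same(image, 3)
print(padded.shape)  # (7, 7)
# A 3 x 3 filter on the 7 x 7 padded image gives a 5 x 5 output:
# 7 - 3 + 1 = 5 along each axis, matching the original image size.
```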

Non-Linearity (ReLU)

ReLU stands for Rectified Linear Unit, a non-linear operation whose output is f(x) = max(0, x). Why is ReLU important? Its purpose is to introduce non-linearity into our ConvNet, since the real-world data we want our ConvNet to learn is non-linear. Other non-linear functions such as tanh or sigmoid can be used instead of ReLU, but most practitioners use ReLU because its performance is better than the other two.
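The operation is a one-liner; a small sketch with illustrative values:

```python
import numpy as np

def relu(x):
    """Elementwise Rectified Linear Unit: f(x) = max(0, x)."""
    return np.maximum(0, x)

feature_map = np.array([[-3.0, 2.0],
                        [ 1.5, -0.5]])
print(relu(feature_map))
# [[0.  2. ]
#  [1.5 0. ]]
```

Negative values become zero; positive values pass through unchanged.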

Pooling Layer

Pooling layers reduce the number of parameters when the images are too large. Spatial pooling (also called subsampling or downsampling) reduces the dimensionality of each feature map while retaining the important information. Spatial pooling can be of different types:
• Max pooling — takes the largest element from the rectified feature map.
• Average pooling — takes the average of the elements in each window.
• Sum pooling — takes the sum of all elements in each window.
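A sketch of the three pooling types on a small feature map (the helper name and values are illustrative):

```python
import numpy as np

def pool2d(x, size=2, stride=2, mode="max"):
    """Spatial pooling over strided windows; mode is 'max', 'avg', or 'sum'."""
    ops = {"max": np.max, "avg": np.mean, "sum": np.sum}
    op = ops[mode]
    oh = (x.shape[0] - size) // stride + 1
    ow = (x.shape[1] - size) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            r, c = i * stride, j * stride
            out[i, j] = op(x[r:r+size, c:c+size])
    return out

fm = np.array([[1, 3, 2, 1],
               [4, 6, 5, 0],
               [0, 2, 1, 8],
               [3, 1, 7, 4]])
print(pool2d(fm, mode="max"))  # [[6. 5.] [3. 8.]]
print(pool2d(fm, mode="avg"))  # [[3.5 2. ] [1.5 5. ]]
```

With a 2 × 2 window and stride 2, the 4 × 4 map shrinks to 2 × 2: each output value summarizes one non-overlapping window.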

Fully Connected Layer

In the layer we call the FC layer, we flatten our matrix into a vector and feed it into a fully connected layer, like an ordinary neural network. In the diagram above, the feature map matrix is converted into a vector (x1, x2, x3, …). The fully connected layers combine these features together to create a model. Finally, an activation function such as softmax or sigmoid classifies the outputs as cat, dog, car, truck, etc.
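A sketch of the flatten, FC, and softmax steps with random weights (the feature-map values, class count, and weight initialization are all hypothetical):

```python
import numpy as np

def softmax(z):
    """Convert raw scores to probabilities that sum to 1."""
    e = np.exp(z - np.max(z))  # subtract max for numerical stability
    return e / e.sum()

# Hypothetical 2x2 feature map, flattened into a vector (x1, x2, ...)
feature_map = np.array([[0.9, 0.1],
                        [0.4, 0.7]])
x = feature_map.flatten()

# Fully connected layer: one weight per (class, feature) pair, plus biases
rng = np.random.default_rng(0)
W = rng.normal(size=(4, x.size))   # 4 classes: e.g. cat, dog, car, truck
b = np.zeros(4)

probs = softmax(W @ x + b)
print(probs.sum())  # 1.0 (up to floating point)
```

In a trained network, W and b would be learned so that the highest probability lands on the correct class.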

Complete CNN Architecture

Summary of CNN

• Provide the input image to the convolution layer.
• Choose parameters, then apply filters with strides and padding if required. Perform convolution on the image and apply ReLU activation to the resulting matrix.
• Perform pooling to reduce the dimensionality.
• Add as many convolutional layers as needed.
• Flatten the output and feed it into a fully connected layer (FC layer).
• Output the class using an activation function (logistic regression with a cost function) and classify the image.
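The whole pipeline can be sketched end to end as a single forward pass. This is a minimal NumPy illustration with random weights, one filter, and two classes, not a trained model:

```python
import numpy as np

rng = np.random.default_rng(42)

def conv_valid(img, k):
    """Stride-1 'valid' convolution."""
    kh, kw = k.shape
    return np.array([[np.sum(img[i:i+kh, j:j+kw] * k)
                      for j in range(img.shape[1] - kw + 1)]
                     for i in range(img.shape[0] - kh + 1)])

def relu(x):
    return np.maximum(0, x)

def max_pool(x, s=2):
    """2x2 max pooling with stride 2."""
    oh, ow = x.shape[0] // s, x.shape[1] // s
    return np.array([[x[i*s:(i+1)*s, j*s:(j+1)*s].max()
                      for j in range(ow)] for i in range(oh)])

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Forward pass for one 8x8 grayscale image with random weights
image = rng.random((8, 8))
kernel = rng.normal(size=(3, 3))                  # convolution filter
fmap = max_pool(relu(conv_valid(image, kernel)))  # conv -> ReLU -> pool
x = fmap.flatten()                                # flatten for the FC layer
W, b = rng.normal(size=(2, x.size)), np.zeros(2)  # FC layer, 2 classes
probs = softmax(W @ x + b)                        # class probabilities
print(probs)
```

Training would adjust the kernel, W, and b by backpropagation; here they are random, so the probabilities are meaningless except as a shape check.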

Application Areas of CNN

Among neural networks, convolutional neural networks (ConvNets or CNNs) are one of the main categories used for image recognition and image classification. Object detection and face recognition are some of the areas where CNNs are widely used.

How Convolutional Neural Networks Work

ConvNet: X's and O's. The CNN says whether a picture is of an X or an O. The input is a two-dimensional array of pixels.

For example: the CNN maps one picture to X and another to O.

Trickier cases include translation, scaling, weight, and rotation.

Deciding is hard: are these two pictures equal?

What computers see: a grid of pixel values, not an image.

Computers are literal: a pixel-by-pixel comparison of two X's does not match exactly.

ConvNets match pieces of the image.

Features match pieces of the image.

Filtering: The math behind the match

1. Line up the feature and the image patch.
2. Multiply each image pixel by the corresponding feature pixel.
3. Add them up.
4. Divide by the total number of pixels in the feature.
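The four steps above can be sketched directly; the diagonal-X feature below is illustrative:

```python
import numpy as np

def match_score(feature, patch):
    """Filtering math from the slides: multiply aligned pixels,
    add the products, divide by the number of pixels in the feature."""
    return np.sum(feature * patch) / feature.size

# A 3x3 'diagonal' feature: +1 on the diagonal, -1 elsewhere
feature = np.array([[ 1, -1, -1],
                    [-1,  1, -1],
                    [-1, -1,  1]])

print(match_score(feature, feature))  # 1.0 (perfect match)

patch = feature.copy()
patch[0, 1] = 1                       # flip one pixel
print(match_score(feature, patch))    # 7/9, one product became -1
```

A perfect match scores 1 because every product is +1 (1 × 1 = 1 and -1 × -1 = 1); each disagreeing pixel subtracts 2/9 from the score.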

Worked example: wherever the feature and the image patch agree, the product is 1 (1 × 1 = 1, and -1 × -1 = 1); wherever they disagree, the product is -1 (-1 × 1 = -1). For a perfect match, all nine products are 1, so the sum divided by 9 is 1. For an imperfect match, the score is lower, e.g. 0.55.

Convolution: trying every possible match. The feature is tried at every possible position in the image, producing a map of match scores.

Convolution layer: one image becomes a stack of filtered images, one per feature.

Pooling: Shrinking the image stack
1. Pick a window size (usually 2 or 3).
2. Pick a stride (usually 2).
3. Walk your window across your filtered images.
4. From each window, take the maximum value.
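The four steps can be sketched explicitly (the match-score values below are illustrative):

```python
import numpy as np

window, stride = 2, 2            # 1) pick a window size, 2) pick a stride
img = np.array([[0.77, -0.11, 0.33, 0.55],
                [-0.11, 1.00, -0.11, 0.33],
                [0.33, -0.11, 1.00, -0.11],
                [0.55, 0.33, -0.11, 0.77]])

rows = range(0, img.shape[0] - window + 1, stride)
cols = range(0, img.shape[1] - window + 1, stride)
pooled = np.array([[img[r:r+window, c:c+window].max()  # 4) take the max
                    for c in cols]                     # 3) walk the window
                   for r in rows])
print(pooled)
# [[1.   0.55]
#  [0.55 1.  ]]
```

The strong responses survive the shrinking: the 4 × 4 map becomes 2 × 2, and each output keeps the best match found anywhere in its window.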

Pooling: at each window position, the maximum value in the window is kept (max pooling).

Pooling layer: a stack of images becomes a stack of smaller images.

Normalization: keep the math from breaking by tweaking each of the values just a bit; change everything negative to zero.

Rectified Linear Units (ReLUs): wherever a negative value occurs, it is replaced with zero.

ReLU layer: a stack of images becomes a stack of images with no negative values.

Layers get stacked: the output of one becomes the input of the next (Convolution → ReLU → Pooling).

Deep stacking: layers can be repeated several (or many) times, e.g. Convolution → ReLU → Pooling → Convolution → ReLU → Convolution → ReLU → Pooling.

Fully connected layer: every value gets a vote.

Vote depends on how strongly a value predicts X or O

Prediction Of Image Using Convolutional Neural Networks – Fully Connected Layer

Fully connected layer: feature values vote on X or O. In the example, X receives a total vote of .92 and O receives .51, so the network predicts X.

Fully connected layer: a list of feature values becomes a list of votes.

Fully connected layers can also be stacked.