UNIT-IV.pptx Deep Learning overview algo

UNIT-IV DEEP LEARNING ARCHITECTURES

CONTENTS
Introduction to Convolutional Neural Networks
Kernel filter
Principles behind CNNs
Multiple Filters
CNN applications
Introduction to Recurrent Neural Networks
Introduction to AutoEncoders

INTRODUCTION TO CNN
A convolutional neural network (CNN) is a type of supervised deep learning algorithm. It is an extension of artificial neural networks (ANNs) and is predominantly used for image recognition tasks. The steps that go into creating a convolutional neural network are:
1. Image channels
2. Convolution
3. Pooling
4. Flattening
5. Full connection

Image channels
The first step in making an image compatible with the CNN algorithm is to represent it in a numerical format. The image is represented by its pixels. In an 8-bit grayscale image, each pixel is mapped to a number between 0 and 255, where 0 is black and 255 is white. A color (RGB) image stores three such numbers per pixel, one per channel.

Convolution
Convolution is an operation in which one function modifies (or convolves) the shape of another. In image processing, convolutions are applied for purposes such as sharpening, smoothing, and intensifying. In a CNN, convolutions are applied to extract the prominent features within the images.
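The sliding-window computation behind a convolution can be sketched in a few lines of NumPy. This is only an illustration: the 5×5 image and the vertical-edge kernel are made-up values, and, as in most deep learning libraries, the kernel is applied without flipping (strictly a cross-correlation), with no padding and a stride of 1.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]            # region under the kernel
            feature_map[i, j] = np.sum(patch * kernel)   # multiply element-wise, then sum
    return feature_map

# Hypothetical 5x5 single-channel image: left two columns 0, right three columns 255.
image = np.array([[0, 0, 255, 255, 255]] * 5, dtype=float)

# Hand-picked vertical-edge kernel (in a CNN these weights would be learned).
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)

feature_map = convolve2d_valid(image, kernel)
print(feature_map)
# [[765. 765.   0.]
#  [765. 765.   0.]
#  [765. 765.   0.]]
```

The large values mark the positions where the kernel straddles the jump from 0 to 255, and the zeros mark the uniform region, which is the sense in which a convolution "extracts" a feature.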

Pooling layers
Pooling is the process of summarizing the features within a group of cells in the feature map. The summary can be obtained by taking the maximum, minimum, or average within the group; these methods are referred to as max, min, and average pooling, respectively. In a CNN, pooling is applied to the feature maps that result from each filter.

Flattening
The final step in this process is to make the outputs of the CNN compatible with an artificial neural network. The inputs to an ANN should be in the form of a vector. To support that, flattening converts the multidimensional array into an n×1 vector. In a CNN, flattening is applied to the feature maps that result from each filter.
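Pooling and flattening can be illustrated on a small feature map. The sketch below assumes non-overlapping 2×2 windows (a common choice, not something the slides specify), and the 4×4 values are invented for the example.

```python
import numpy as np

def pool2d(feature_map, size=2, mode="max"):
    """Summarize non-overlapping size x size blocks of a 2D feature map."""
    h, w = feature_map.shape
    h, w = h - h % size, w - w % size   # drop edge rows/columns that do not fill a block
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    reducers = {"max": np.max, "min": np.min, "average": np.mean}
    return reducers[mode](blocks, axis=(1, 3))

feature_map = np.array([[1, 3, 2, 4],
                        [5, 7, 6, 8],
                        [9, 2, 1, 0],
                        [3, 4, 5, 6]], dtype=float)

max_pooled = pool2d(feature_map, mode="max")       # [[7, 8], [9, 6]]
min_pooled = pool2d(feature_map, mode="min")       # [[1, 2], [2, 0]]
avg_pooled = pool2d(feature_map, mode="average")   # [[4, 5], [4.5, 3]]

# Flattening: turn the pooled 2x2 map into an n x 1 column vector for the ANN.
flat = max_pooled.reshape(-1, 1)
print(flat.shape)  # (4, 1)
```

Each 2×2 block collapses to a single number, so the 4×4 map shrinks to 2×2 before it is flattened into a 4×1 vector.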

Kernel Filter
Kernel filters are small matrices used in Convolutional Neural Networks (CNNs) to detect features like edges, textures, or patterns in images. They slide over the input, performing element-wise multiplication and summation to produce a feature map. Each filter learns to recognize specific features during training, helping the network focus on important visual information. These filters are essential for reducing data complexity and improving image recognition tasks.

Multiple kernels are used in each convolutional layer to extract different features simultaneously. This process not only helps in reducing the complexity of the input by focusing on the most important parts but also makes the model more robust to shifts and distortions in the image. Overall, kernel filters are fundamental components that enable CNNs to perform tasks such as image classification, object detection, and visual recognition with high accuracy.

Principles behind CNN
1. Convolution (Local Receptive Fields): Applies filters to small regions of the input to extract local features like edges or textures.
2. Weight Sharing: Uses the same filter across the entire input, reducing the number of parameters and enabling pattern detection in different locations.
3. Hierarchical Feature Learning: Builds complex patterns layer by layer, starting from simple features in early layers and moving to more abstract ones in deeper layers.
4. Pooling (Subsampling): Reduces the spatial size of feature maps to lower computational load and make features more robust to spatial changes.
5. Activation Functions: Applies non-linear functions (like ReLU) after convolutions to enable learning of complex patterns.
6. Fully Connected Layers: Connects every neuron from the last layer to the output layer for classification or regression tasks.
7. Backpropagation and Training: Updates weights through gradient descent using a loss function, allowing the network to learn from data.
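The effect of weight sharing on parameter count can be checked with simple arithmetic. The sketch below compares a convolutional layer with a fully connected layer that produces an output of the same size; the 28×28×1 input, the 3×3 kernel, and the 32 filters are illustrative choices, and both helper functions are hypothetical names introduced for this example.

```python
def conv_layer_params(kernel_h, kernel_w, in_channels, num_filters):
    # Each filter has kernel_h * kernel_w * in_channels shared weights plus one bias.
    return num_filters * (kernel_h * kernel_w * in_channels + 1)

def dense_layer_params(num_inputs, num_outputs):
    # Every input connects to every output, plus one bias per output.
    return num_inputs * num_outputs + num_outputs

# 32 filters of size 3x3 applied to a 28x28x1 input.
conv_params = conv_layer_params(3, 3, 1, 32)

# A fully connected layer mapping the same 784 inputs to a 28x28x32 output.
dense_params = dense_layer_params(28 * 28 * 1, 28 * 28 * 32)

print(conv_params)   # 320
print(dense_params)  # 19694080
```

Because the same 3×3 weights are reused at every image position, the convolutional layer needs 320 parameters where the fully connected equivalent would need close to 20 million.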

Multiple Filters
In a Convolutional Neural Network (CNN), each convolutional layer typically uses not just one but many filters. This is a fundamental principle that allows CNNs to learn diverse and complex representations from data, especially images.

What Are Filters?
A filter (or kernel) is a small matrix of weights (e.g., 3×3 or 5×5) that slides over the input image. At each position it performs element-wise multiplication with a part of the image (called the receptive field) and sums the products, producing a single number. This process is repeated over the whole image to create a feature map.

Why Use Multiple Filters?
Single filter: can only detect one type of pattern (e.g., vertical edges).
Multiple filters: allow the network to detect a variety of patterns, such as edges, textures, corners, gradients, and shapes, all at once.
For example, a convolutional layer might use 32, 64, or even 512 filters in parallel.

How It Works
Each filter learns to activate in response to a specific visual pattern. After the convolution operation, each filter produces one feature map. These feature maps are then stacked along the depth dimension (channel-wise), forming a multi-channel output (e.g., from an [H×W×3] RGB image to an [H×W×64] feature volume).

Example
Suppose the first convolutional layer has 32 filters:
Input: 28×28×1 (e.g., grayscale image)
After convolution: 28×28×32 (32 feature maps stacked depth-wise; the 28×28 size is preserved only if the input is padded, e.g., "same" padding with stride 1)
Each of the 32 filters learns to detect a different pattern in the input.
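The shape bookkeeping in this example can be reproduced with a small NumPy sketch. It assumes 3×3 filters, zero ("same") padding, and stride 1, none of which the slides state, and it uses random weights in place of learned ones; `conv_layer_same` is a hypothetical helper, not a library function.

```python
import numpy as np

def conv_layer_same(image, filters):
    """image: (H, W, C_in); filters: (F, k, k, C_in) with odd k. Returns (H, W, F)."""
    h, w, _ = image.shape
    num_filters, k, _, _ = filters.shape
    pad = k // 2
    # Zero-pad height and width so the output keeps the input's spatial size.
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)))
    output = np.zeros((h, w, num_filters))
    for f in range(num_filters):            # one feature map per filter
        for i in range(h):
            for j in range(w):
                patch = padded[i:i + k, j:j + k, :]
                output[i, j, f] = np.sum(patch * filters[f])
    return output

rng = np.random.default_rng(0)
image = rng.random((28, 28, 1))                 # stand-in for a grayscale image
filters = rng.standard_normal((32, 3, 3, 1))    # 32 random (untrained) 3x3 filters

feature_volume = conv_layer_same(image, filters)
print(feature_volume.shape)  # (28, 28, 32)
```

Without the padding step, each 3×3 filter would produce a 26×26 map instead of 28×28.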

Applications of Convolutional Neural Networks
1. Image Classification
2. Object Detection and Localization
3. Image Segmentation
4. Medical Image Analysis
5. Video Analysis and Surveillance
6. Facial Recognition
7. Neural Style Transfer and Image Generation
8. Remote Sensing and Satellite Imaging
9. Handwriting and Text Recognition
10. Robotics and Autonomous Vehicles

Recurrent Neural Networks
In time series data, the current observation depends on previous observations, so observations are not independent of each other. Traditional neural networks, however, treat each observation as independent because they cannot retain past information; they have no memory of what happened earlier. This led to the rise of Recurrent Neural Networks (RNNs), which introduced the concept of memory to neural networks by modeling the dependency between data points. With this, RNNs can be trained to remember concepts based on context, i.e., to learn repeated patterns.
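The "memory" of an RNN is a hidden state that is fed back into the network at every time step. The sketch below shows the forward pass of a simple (Elman-style) RNN with a tanh activation; the sizes, the random weights, and the names `W_xh`, `W_hh`, and `b_h` are assumptions made for illustration, and no training is performed.

```python
import numpy as np

def rnn_forward(inputs, W_xh, W_hh, b_h):
    """Run a simple RNN over a sequence and return the hidden state at every step."""
    hidden = np.zeros(W_hh.shape[0])     # memory starts empty
    states = []
    for x_t in inputs:                   # one observation per time step
        # The new memory mixes the current input with the previous memory.
        hidden = np.tanh(W_xh @ x_t + W_hh @ hidden + b_h)
        states.append(hidden)
    return np.stack(states)

rng = np.random.default_rng(0)
input_size, hidden_size, seq_len = 3, 4, 5     # illustrative sizes
W_xh = 0.5 * rng.standard_normal((hidden_size, input_size))
W_hh = 0.5 * rng.standard_normal((hidden_size, hidden_size))
b_h = np.zeros(hidden_size)
sequence = rng.standard_normal((seq_len, input_size))

states = rnn_forward(sequence, W_xh, W_hh, b_h)
print(states.shape)  # (5, 4)
```

Because each hidden state depends on the one before it, changing the first observation in `sequence` also changes the last row of `states`, which is the dependency between data points described above.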

Disadvantages of RNNs
1. Vanishing and Exploding Gradients
2. Difficulty in Capturing Long-Term Dependencies
3. Slow and Computationally Expensive Training
4. Limited Memory Capacity
5. Inefficient Parallelization

Long Short-Term Memory (LSTM)
LSTMs are a special type of RNN that tackles the main problem of simple RNNs: vanishing gradients, i.e., the loss of information that lies further in the past. The LSTM network architecture consists of three parts, as shown in the image below, and each part performs an individual function.

The first part decides whether the information coming from the previous time step should be remembered or is irrelevant and can be forgotten. In the second part, the cell tries to learn new information from the input to this cell. Finally, in the third part, the cell passes the updated information from the current time step to the next one. One such cycle of the LSTM is considered a single time step.

These three parts of an LSTM unit are known as gates. They control the flow of information into and out of the memory cell (or LSTM cell). The first gate is called the forget gate, the second the input gate, and the last the output gate. An LSTM unit consisting of these three gates and a memory cell can be thought of as a layer of neurons in a traditional feedforward neural network, with each unit carrying a hidden state and a cell state.
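One time step of an LSTM cell can be written out directly from the three gates. The sketch below uses the standard formulation with sigmoid gates and a tanh candidate value; stacking all four weight blocks into a single matrix `W`, and the particular gate order, are layout assumptions for this example rather than anything the slides prescribe.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step. W: (4*hidden, input+hidden), b: (4*hidden,)."""
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([x_t, h_prev]) + b
    f = sigmoid(z[0:hidden])                   # forget gate: how much old memory to keep
    i = sigmoid(z[hidden:2 * hidden])          # input gate: how much new information to admit
    g = np.tanh(z[2 * hidden:3 * hidden])      # candidate values learned from the input
    o = sigmoid(z[3 * hidden:4 * hidden])      # output gate: how much memory to expose
    c_t = f * c_prev + i * g                   # updated cell state (long-term memory)
    h_t = o * np.tanh(c_t)                     # updated hidden state passed to the next step
    return h_t, c_t

rng = np.random.default_rng(0)
input_size, hidden_size = 3, 2                 # illustrative sizes
W = rng.standard_normal((4 * hidden_size, input_size + hidden_size))
b = np.zeros(4 * hidden_size)

h_t, c_t = lstm_step(rng.standard_normal(input_size),
                     np.zeros(hidden_size), np.zeros(hidden_size), W, b)
print(h_t.shape, c_t.shape)  # (2,) (2,)
```

The cell state `c_t` is updated additively (`f * c_prev + i * g`), which is what lets information from earlier time steps survive when the forget gate stays close to 1.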

Introduction to Auto Encoders
Autoencoders are neural network-based models used for unsupervised learning, to discover underlying correlations in data and to represent the data in a smaller dimension. An autoencoder frames an unsupervised learning problem as a supervised one by using the input itself as the training target. The input is squeezed down to a lower-dimensional encoded representation by an encoder network, and a decoder network then decodes the encoding to recreate the input. The encoding produced by the encoder is a lower-dimensional representation of the data and can reveal interesting, complex relationships within it.

An Autoencoder has the following parts:
Encoder: The part of the network that takes in the input and produces a lower-dimensional encoding.
Bottleneck: The lower-dimensional hidden layer where the encoding is produced. The bottleneck layer has fewer nodes, and the number of nodes in it gives the dimension of the encoding of the input.
Decoder: The part that takes in the encoding and recreates the input.

Phi (φ) and theta (θ) denote the parameters of the encoder and decoder, respectively. The goal of the model is for the reconstructed output to match the input. To achieve this, we minimize a loss function called the reconstruction loss, which measures the error between the input and the reconstructed output. It is usually the mean squared error or the binary cross-entropy between the input and the reconstructed output. Binary cross-entropy is used if the data is binary.
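Both reconstruction losses are short formulas: the mean squared error averages (x - x_hat)^2 over the input dimensions, and the binary cross-entropy averages -[x log(x_hat) + (1 - x) log(1 - x_hat)]. A minimal NumPy sketch follows; the input `x` and reconstruction `x_hat` are invented values, and the clipping constant `eps` is an assumption added only to avoid log(0).

```python
import numpy as np

def mse_loss(x, x_hat):
    """Mean squared error between the input and its reconstruction."""
    return np.mean((x - x_hat) ** 2)

def binary_cross_entropy(x, x_hat, eps=1e-12):
    """Binary cross-entropy; expects x in {0, 1} and x_hat in (0, 1)."""
    x_hat = np.clip(x_hat, eps, 1 - eps)   # guard against log(0)
    return -np.mean(x * np.log(x_hat) + (1 - x) * np.log(1 - x_hat))

x = np.array([1.0, 0.0, 1.0, 0.0])         # binary input
x_hat = np.array([0.9, 0.1, 0.8, 0.2])     # hypothetical decoder output

mse = mse_loss(x, x_hat)
bce = binary_cross_entropy(x, x_hat)
print(round(mse, 4))  # 0.025
print(round(bce, 4))  # 0.1643
```

Both values shrink toward zero as `x_hat` approaches `x`, which is what training the encoder and decoder parameters aims for.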

Applications of Auto Encoders
1. Anomaly detection: Autoencoders create encodings that capture the relationships among data. If we train an autoencoder on a particular dataset, the encoder and decoder parameters are trained to represent the relationships in that dataset as well as possible, so the model can recreate data of that kind well. When data from that kind of dataset is sent through the autoencoder, the reconstruction error is low; data of a different kind is reconstructed poorly, and its high reconstruction error marks it as an anomaly.
2. Noise removal: If we pass noisy data as input and clean data as the target output and train an autoencoder on such pairs, the trained autoencoder can be highly useful for noise removal. This works because noise points usually have no correlations. Since the autoencoder must represent the data in a low-dimensional encoding, the encoding usually keeps only the important relationships and rejects the random ones. The decoded output is therefore free of those extra relationships and hence of the noise.
3. Generative models: Before GANs came into existence, autoencoders were used as generative models. One modified form of the autoencoder, the variational autoencoder, is used for generative purposes.
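The anomaly-detection idea can be sketched without training a neural network. The example below substitutes a linear encoder and decoder fitted with PCA (via SVD) for a trained autoencoder, which is a simplification and not the method the slides describe; the synthetic data, the 2-D bottleneck, and the "worst training error" threshold are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "normal" data: 200 points in 5-D lying close to a 2-D subspace.
basis = rng.standard_normal((2, 5))
normal_data = rng.standard_normal((200, 2)) @ basis
normal_data += 0.01 * rng.standard_normal((200, 5))   # small measurement noise

# Fit a linear encoder/decoder with a 2-D bottleneck using SVD (PCA).
mean = normal_data.mean(axis=0)
_, _, vt = np.linalg.svd(normal_data - mean, full_matrices=False)
components = vt[:2]                                    # shape (2, 5)

def encode(x):
    return (x - mean) @ components.T                   # 5-D -> 2-D encoding

def decode(z):
    return z @ components + mean                       # 2-D encoding -> 5-D

def reconstruction_error(x):
    return np.mean((x - decode(encode(x))) ** 2, axis=-1)

# Flag anything that reconstructs worse than the worst training point.
threshold = reconstruction_error(normal_data).max()

familiar = rng.standard_normal(2) @ basis              # same kind of data as the training set
unfamiliar = 3.0 * rng.standard_normal(5)              # unrelated data

print(reconstruction_error(familiar) > threshold)
print(reconstruction_error(unfamiliar) > threshold)
```

One would expect only the second point to exceed the threshold: data resembling the training set passes through the bottleneck almost unchanged, while unrelated data loses most of its content and reconstructs poorly.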

THANK YOU
