Introduction to Deep Learning (UNIT 1).pptx

SURBHI SAROHA · Jul 09, 2024 · 32 slides


Introduction to Deep Learning (UNIT 1). By: Surbhi Saroha

Syllabus. INTRODUCTION: Introduction to machine learning - Linear models (SVMs and Perceptrons, logistic regression) - Intro to Neural Nets: what a shallow network computes - Training a network: loss functions, backpropagation and stochastic gradient descent - Neural networks as universal function approximators

INTRODUCTION: Introduction to machine learning. Machine learning (ML) is a type of artificial intelligence (AI) that allows computers to learn without being explicitly programmed. It involves feeding data into algorithms that can then identify patterns and make predictions on new data. Machine learning is used in a wide variety of applications, including image and speech recognition, natural language processing, and recommender systems. In the real world, we are surrounded by humans who learn from their experiences, while computers and machines simply follow our instructions. But can a machine also learn from experience or past data the way a human does? This is where machine learning comes in.


Cont… Machine learning, a subset of artificial intelligence, focuses primarily on creating algorithms that enable a computer to learn independently from data and previous experience. Arthur Samuel first used the term "machine learning" in 1959. It can be summarized as follows: without being explicitly programmed, machine learning enables a machine to automatically learn from data, improve its performance with experience, and make predictions. Machine learning algorithms build a mathematical model from sample historical data (training data) that aids in making predictions or decisions without being explicitly programmed. Machine learning brings together statistics and computer science for the purpose of developing predictive models; its algorithms either construct or apply models that learn from historical data. Performance generally rises in proportion to the quantity of data we provide.

How does Machine Learning work? A machine learning system builds prediction models by learning from previous data, and predicts the output for new data whenever it receives it. The more data available, the better the model that can be built, and the more accurate its predictions on new inputs.

Linear models (SVMs and Perceptrons, logistic regression). Linear Models. Linear models are a class of machine learning algorithms that assume a linear relationship between the input variables (features) and the output variable. They are simple yet powerful models used for both classification and regression tasks. 1. Support Vector Machines (SVMs). Support Vector Machines are supervised learning models used for classification and regression. The main idea behind SVMs is to find the hyperplane that best separates the data into different classes. The hyperplane is chosen such that it maximizes the margin, which is the distance between the hyperplane and the nearest data points from either class, known as support vectors. Linear SVM: when the data is linearly separable, a linear hyperplane can be used to separate the classes. Non-linear SVM: for non-linearly separable data, kernel functions (e.g., polynomial, RBF) can be used to transform the data into a higher-dimensional space where it becomes linearly separable.
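As a minimal sketch of the above (not from the slides), the following uses scikit-learn's SVC on toy data; the dataset and parameters are illustrative assumptions only.

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Toy data: two roughly linearly separable clusters (illustrative only).
X, y = make_blobs(n_samples=100, centers=2, random_state=0)

# Linear SVM: finds the maximum-margin separating hyperplane.
linear_svm = SVC(kernel="linear")
linear_svm.fit(X, y)
print("number of support vectors:", len(linear_svm.support_vectors_))

# Non-linear SVM: an RBF kernel maps the data into a higher-dimensional
# space where a linear separator may exist.
rbf_svm = SVC(kernel="rbf")
rbf_svm.fit(X, y)
```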

2. Perceptrons. The Perceptron is the simplest type of artificial neural network, suitable for binary classification tasks. It consists of a single layer of weights and a bias term, and uses a step activation function. Learning algorithm: the perceptron learning algorithm adjusts the weights based on the error of the prediction. If the prediction is correct, no change is made; if incorrect, the weights are updated to reduce the error. Advantages of Perceptrons: simple and easy to implement; fast convergence for linearly separable data. Disadvantages of Perceptrons: cannot solve problems that are not linearly separable; limited to binary classification.
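A minimal NumPy sketch of the perceptron learning rule described above; the AND-gate data and learning rate are illustrative assumptions.

```python
import numpy as np

def train_perceptron(X, y, lr=0.1, epochs=10):
    """Perceptron rule: step activation, weights updated only on mistakes."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            pred = 1 if np.dot(w, xi) + b > 0 else 0
            error = target - pred      # 0 if correct, +1 or -1 if wrong
            w += lr * error * xi       # no change when the prediction is right
            b += lr * error
    return w, b

# AND gate: linearly separable, so the perceptron converges.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])
w, b = train_perceptron(X, y)
print(w, b)
```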

Logistic regression. Logistic regression is a statistical method for analyzing datasets in which one or more independent variables determine an outcome. The outcome is typically a binary variable (0 or 1, true or false, yes or no). It is used to predict the probability of a binary response based on one or more predictor variables (features). Logistic regression is a supervised machine learning algorithm used for classification tasks, where the goal is to predict the probability that an instance belongs to a given class. As a statistical algorithm, it analyzes the relationship between the predictor variables and the categorical outcome.
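A minimal sketch of logistic regression (toy data assumed for illustration): a linear score w·x + b is passed through the sigmoid to yield a probability, with scikit-learn doing the fitting.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def sigmoid(z):
    # Maps any real-valued score to a probability in (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

# Toy binary problem (illustrative): hours studied -> pass (1) / fail (0).
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])
y = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression()
model.fit(X, y)
print(model.predict_proba([[3.5]]))  # [P(fail), P(pass)] for 3.5 hours
```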

Types of Logistic Regression. On the basis of the categories, logistic regression can be classified into three types. Binomial: in binomial logistic regression, there can be only two possible types of the dependent variable, such as 0 or 1, Pass or Fail, etc. Multinomial: in multinomial logistic regression, there can be 3 or more possible unordered types of the dependent variable, such as "cat", "dog", or "sheep". Ordinal: in ordinal logistic regression, there can be 3 or more possible ordered types of the dependent variable, such as "low", "medium", or "high".

Intro to Neural Nets: What a shallow network computes. Neural networks mimic the basic functioning of the human brain and are inspired by how the human brain interprets information. They solve various real-time tasks because of their ability to perform computations quickly and respond fast. Concretely, a shallow network applies a learned linear transformation to its input, passes the result through a non-linear activation in a single hidden layer, and combines the hidden activations into the output, as sketched below.
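A minimal NumPy sketch of what a shallow (one-hidden-layer) network computes; the layer sizes and random weights are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = rng.normal(size=3)                           # input: 3 features
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)    # hidden layer: 4 neurons
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)    # output layer: 1 neuron

h = sigmoid(W1 @ x + b1)   # hidden activations: non-linearity on a linear map
y = sigmoid(W2 @ h + b2)   # output: y = f(W2 · f(W1 · x + b1) + b2)
print(y)
```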

Cont… An artificial neural network has a huge number of interconnected processing elements, also known as nodes. These nodes are connected to other nodes via connection links. Each connection link carries a weight, and these weights contain the information about the input signal. Each iteration and input in turn leads to updating of these weights. After all the data instances from the training data set have been input, the final weights of the neural network, along with its architecture, constitute the trained neural network. This process is called training of neural networks. These trained neural networks solve specific problems as defined in the problem statement.

What are Neural Networks Used For? Neural networks are employed across various domains for: identifying objects and faces, and understanding spoken language in applications like self-driving cars and voice assistants; analyzing and understanding human language, enabling sentiment analysis, chatbots, language translation, and text generation; diagnosing diseases from medical images, predicting patient outcomes, and drug discovery; predicting stock prices, credit risk assessment, fraud detection, and algorithmic trading.

Cont…. Personalizing content and recommendations in e-commerce, streaming platforms, and social media. Powering robotics and autonomous vehicles by processing sensor data and making real-time decisions. Enhancing game AI, generating realistic graphics, and creating immersive virtual environments. Monitoring and optimizing manufacturing processes, predictive maintenance, and quality control. Analyzing complex datasets, simulating scientific phenomena, and aiding research across disciplines. Generating music, art, and other creative content.

Types of Neural Networks in Machine Learning. Artificial Neural Network (ANN): an ANN is a feed-forward neural network, because inputs are processed in the forward direction only. It can also contain hidden layers, which can make the model even denser. Inputs have a fixed length, as specified by the programmer. It is used for textual or tabular data; a widely used real-life application is facial recognition. It is comparatively less powerful than CNNs and RNNs. Convolutional Neural Network (CNN): CNNs are mainly used for image data and computer vision. Real-life applications include object detection in autonomous vehicles. A CNN contains a combination of convolutional layers and neurons. It is more powerful than both ANNs and RNNs.

Cont… Recurrent Neural Network (RNN): RNNs are used to process and interpret time-series data. In this type of model, the output from a processing node is fed back into nodes in the same or previous layers. The best-known type of RNN is the LSTM (Long Short-Term Memory) network.

Types of Learning in Neural Networks: Supervised Learning, Unsupervised Learning, Reinforcement Learning.

Supervised Learning. As the name suggests, supervised learning is a type of learning that is overseen by a supervisor; it is like learning with a teacher. There are training pairs, each containing an input and the desired output. The output from the model is compared with the desired output and an error is calculated; this error signal is sent back into the network to adjust the weights. This adjustment continues until no more adjustments can be made and the output of the model matches the desired output. In this setting, the model receives feedback from the environment.

Unsupervised Learning. Unlike supervised learning, there is no supervisor or teacher here. In this type of learning, there is no feedback from the environment and no desired output; the model learns on its own. During the training phase, the inputs are grouped into classes based on the similarity of their members, so each class contains similar input patterns. When a new pattern is input, the model can predict which class that input belongs to based on its similarity with other patterns; if no such class exists, a new class is formed.

Reinforcement Learning. It gets the best of both worlds, combining aspects of supervised and unsupervised learning. It is like learning with a critic. Here there is no exact feedback from the environment; rather, there is critique feedback, which tells how close our solution is. The model then learns on its own based on this critique information. It is similar to supervised learning in that it receives feedback from the environment, but different in that it does not receive the desired output itself, only the critique.

Training a network: loss functions, backpropagation and stochastic gradient descent. Training a neural network involves several key concepts and steps. Here's an overview: loss functions, backpropagation, and stochastic gradient descent (SGD). 1. Loss Functions. A loss function measures how well the neural network's predictions match the actual target values. It quantifies the difference between the predicted output and the actual output. Common loss functions include mean squared error (MSE) for regression and cross-entropy for classification.
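Minimal NumPy sketches of the two loss functions named above (the input values are illustrative):

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean squared error: typical loss for regression.
    return np.mean((y_true - y_pred) ** 2)

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    # Cross-entropy for binary classification; eps guards against log(0).
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

print(mse(np.array([1.0, 2.0]), np.array([1.1, 1.9])))               # 0.01
print(binary_cross_entropy(np.array([1, 0]), np.array([0.9, 0.2])))  # ~0.16
```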

Cont…. 2. Backpropagation. Backpropagation is the process of computing gradients of the loss function with respect to each weight in the network. This involves two passes through the network. Forward pass: compute the output of the network and the loss. Backward pass: calculate the gradient of the loss with respect to each weight using the chain rule of calculus. This involves calculating the error at the output layer, then propagating this error backward through the network layers to compute the gradient of the loss with respect to each weight.
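A minimal sketch of this chain-rule computation for a single sigmoid neuron with squared-error loss (all values illustrative):

```python
import numpy as np

x = np.array([0.5, -1.0])          # input
w = np.array([0.2, 0.4])           # weights
b, target = 0.1, 1.0

# Forward pass: compute output and loss.
z = np.dot(w, x) + b
y = 1.0 / (1.0 + np.exp(-z))       # sigmoid activation
loss = 0.5 * (y - target) ** 2

# Backward pass: apply the chain rule step by step.
dL_dy = y - target                 # derivative of the squared-error loss
dy_dz = y * (1 - y)                # derivative of the sigmoid
dL_dz = dL_dy * dy_dz              # error signal at the output
dL_dw = dL_dz * x                  # gradient w.r.t. each weight
dL_db = dL_dz                      # gradient w.r.t. the bias
print(dL_dw, dL_db)
```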

Cont….. 3. Stochastic Gradient Descent (SGD). SGD is an optimization algorithm used to minimize the loss function. It updates the network's weights iteratively to reduce the loss. The process involves: Initialize weights: start with random weights. Repeat until convergence: Sample a batch: select a random batch of training data. Forward pass: compute the predictions for this batch. Compute loss: calculate the loss for the batch. Backward pass: compute the gradients of the loss with respect to the weights. Update weights: adjust the weights using the gradients.
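A minimal sketch of this loop for a linear model with squared-error loss on synthetic data; the learning rate, batch size, and data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                  # synthetic inputs
true_w = np.array([3.0, -2.0])
y = X @ true_w + 0.1 * rng.normal(size=200)    # noisy linear targets

w = rng.normal(size=2)                         # 1. initialize weights randomly
lr, batch_size = 0.1, 16
for step in range(500):                        # 2. repeat until convergence
    idx = rng.choice(len(X), size=batch_size)  # sample a random batch
    Xb, yb = X[idx], y[idx]
    pred = Xb @ w                              # forward pass
    grad = 2 * Xb.T @ (pred - yb) / batch_size # gradient of the batch MSE
    w -= lr * grad                             # update weights
print(w)  # approaches [3.0, -2.0]
```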

Summary Loss Functions measure the error of the network's predictions. Backpropagation calculates the gradients of the loss with respect to the network's weights. Stochastic Gradient Descent (SGD) uses these gradients to update the weights iteratively to minimize the loss. By repeatedly applying these steps, the neural network learns to make better predictions on the training data, ultimately improving its performance on unseen data.

Neural networks as universal function approximators. Neural networks are often described as universal function approximators. This means that a neural network can approximate any continuous function to an arbitrary degree of accuracy, given sufficient resources (such as the number of neurons and layers) and appropriate training. Here's a deeper look into this idea. Universal Approximation Theorem. The Universal Approximation Theorem is a foundational result in neural network theory. It states, for feedforward networks: a feedforward neural network with a single hidden layer containing a finite number of neurons can approximate any continuous function on compact subsets of its input space, given appropriate weights and biases.

Cont….. For activation functions: the theorem applies to a wide variety of activation functions, including the sigmoid function, hyperbolic tangent (tanh), and Rectified Linear Unit (ReLU). Practical Considerations. Training data: the quality and quantity of training data are crucial for the network to learn a good approximation; insufficient or poor-quality data can lead to underfitting or overfitting. Network architecture: the architecture (number of layers and neurons) must balance the trade-off between underfitting and overfitting; too few neurons or layers can lead to underfitting, while too many can cause overfitting. Regularization: techniques like dropout, weight decay, and batch normalization help prevent overfitting and improve the network's generalization capabilities. Optimization: efficient optimization algorithms (like Adam, RMSprop, or SGD with momentum) are essential for training deep networks effectively.
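As a toy illustration of the theorem (not from the slides; all hyperparameters are assumptions), a single hidden layer of tanh units can be trained with plain gradient descent to approximate sin(x):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
t = np.sin(x)                                   # target function to approximate

H = 20                                          # hidden units
W1, b1 = rng.normal(size=(1, H)), np.zeros(H)
W2, b2 = rng.normal(size=(H, 1)), np.zeros(1)

lr = 0.05
for _ in range(5000):
    h = np.tanh(x @ W1 + b1)                    # hidden activations
    yhat = h @ W2 + b2                          # linear output layer
    err = yhat - t
    # Gradients of the mean squared error (factor 2 folded into lr).
    dW2 = h.T @ err / len(x)
    db2 = err.mean(axis=0)
    dh = (err @ W2.T) * (1 - h ** 2)            # backprop through tanh
    dW1 = x.T @ dh / len(x)
    db1 = dh.mean(axis=0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

print(float(np.mean(err ** 2)))                 # error shrinks as training runs
```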

Conclusion. The Universal Approximation Theorem provides a theoretical foundation for the power of neural networks. It assures us that, given enough resources, a neural network can approximate any continuous function. However, practical neural network design involves considerations like network architecture, training data, regularization, and optimization to achieve good performance on real-world tasks.

THANK YOU 