hal-lectures-01231452-dl-sunilpatnaik.pptx

SatvikEinstein 4 views 23 slides Aug 19, 2024

About This Presentation

neural networks


Slide Content

Fundamentals of Deep Neural Networks
Jiaul Paik
Email: [email protected]

First Things to Note
In machine learning / deep learning, we almost always deal with mathematical functions.
A model almost always refers to a mathematical function.
Input -> F() -> Output / Target

ML: Design Pattern
Problem statement: know your data well. What is the input? What is the output / target?
Define your model: choose a reasonable mathematical function. It contains unknown parameters.
Define the objective function: error / loss / profit.
Optimize the objective: use the training data and an optimization method to estimate the parameters.
Now you have everything your model needs: freeze the model and start using it.

Let us Start with a Problem

Movie Recommendation
You have two friends, John and Mary. You have prior data on movies that they rated, together with whether you liked or disliked each of those movies. Now a new movie named 'Gravity' has been released. Both friends have rated it, and you want to decide whether you should watch it.

Know Your Data
Sample / example; features / input variables; output / target.

Logistic Regression and Decision Boundary

Preliminaries
Logistic regression is inherently a 2-class classifier.
The classes are traditionally called 0 and 1, or negative and positive.
Logistic regression is a probabilistic classifier.
For an input feature vector $x$, it computes the probability $P(y = 1 \mid x)$ that $x$ is from class 1 (positive).

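Logistic Regression Model
Main assumption: the logarithm of the odds is a linear function of the features of the input $x$.
Mathematically (assume that $x$ has $n$ features: $x_1, \dots, x_n$):
$\log \frac{P(y = 1 \mid x)}{1 - P(y = 1 \mid x)} = w_0 + w_1 x_1 + \dots + w_n x_n$
Rearranging the above, you get
$P(y = 1 \mid x) = \frac{1}{1 + e^{-(w_0 + w_1 x_1 + \dots + w_n x_n)}}$
For convenience we write, instead,
$P(y = 1 \mid x) = \sigma(Z) = \frac{1}{1 + e^{-Z}}$, where $Z = w_0 + \sum_{j=1}^{n} w_j x_j$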
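The step from "log-odds is linear" to a probability can be checked numerically. The following is a minimal sketch, not the lecture's code: the function names, the weights, the bias and the input are all made up for illustration. It computes the linear score Z, maps it to a probability with the logistic (sigmoid) function, and confirms that the log-odds of that probability gives back Z.

```python
import math

def linear_score(weights, bias, features):
    """Z = bias + sum_j weights[j] * features[j] (the assumed log-odds)."""
    return bias + sum(w * x for w, x in zip(weights, features))

def sigmoid(z):
    """Map a log-odds value Z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def log_odds(p):
    """Inverse of the sigmoid: log(p / (1 - p))."""
    return math.log(p / (1.0 - p))

# Illustrative (made-up) parameters and input with two features.
weights = [0.8, -0.5]
bias = 0.1
x = [2.0, 1.0]

z = linear_score(weights, bias, x)   # 0.1 + 0.8*2.0 - 0.5*1.0
p = sigmoid(z)                       # probability of class 1

print(z, p, log_odds(p))             # log_odds(p) recovers z up to rounding
```

For very negative Z, `math.exp(-z)` can overflow; a production implementation would guard against that, which this sketch does not.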

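What kind of boundary can it model?
A linear one: the boundary is the set of points where $Z = 0$.
$Z > 0$ gives $P(y = 1 \mid x) > 0.5$ (class 1)
$Z < 0$ gives $P(y = 1 \mid x) < 0.5$ (class 0)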
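The link between the sign of Z and the 0.5 probability threshold can be sketched as follows. This is an illustrative check with hypothetical helper names, assuming the probability is the logistic function of Z.

```python
import math

def sigmoid(z):
    """Logistic function: probability of class 1 given score Z."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_class(z):
    """Class 1 when Z > 0, otherwise class 0."""
    return 1 if z > 0 else 0

# Thresholding the probability at 0.5 agrees with checking the sign of Z.
for z in (-2.0, -0.1, 0.1, 2.0):
    p = sigmoid(z)
    print(z, round(p, 3), predict_class(z), (p > 0.5) == (z > 0))
```

Because the threshold depends only on the sign of Z, the shape of the boundary is the shape of the set where Z equals zero.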

What about a non-linear boundary?

Dealing with a Non-linear Boundary
$Z$ determines the boundary.
In this case, $Z$ needs to be a quadratic function of the features.
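One common way to obtain a quadratic Z is to expand the input with squared and product terms and keep the model linear in its parameters. The sketch below assumes two features and a circular boundary of radius 1. The feature expansion, the weights and the function names are illustrative choices, not taken from the slides.

```python
def quadratic_features(x1, x2):
    """Expand two features into linear, squared and cross terms."""
    return [x1, x2, x1 * x1, x2 * x2, x1 * x2]

def score(weights, bias, feats):
    """Z = bias + sum_j weights[j] * feats[j]; linear in the parameters."""
    return bias + sum(w * f for w, f in zip(weights, feats))

# Made-up parameters: Z = 1 - x1^2 - x2^2, so Z > 0 inside the unit circle.
weights = [0.0, 0.0, -1.0, -1.0, 0.0]
bias = 1.0

def predict(x1, x2):
    """Class 1 inside the circle (Z > 0), class 0 outside."""
    return 1 if score(weights, bias, quadratic_features(x1, x2)) > 0 else 0

print(predict(0.0, 0.0))   # centre of the circle
print(predict(2.0, 0.0))   # well outside the circle
```

The boundary is curved in the original feature space, yet the parameters still enter Z linearly, so the same estimation procedure applies.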

Getting Model Parameters

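Estimating Model Parameters
Take one sample from the training data.
Let $x$ be the input sample / feature vector.
Let $y \in \{0, 1\}$ be its label / class.
Given the input sample $x$ and the model parameters $w$, with $\sigma(Z) = \frac{1}{1 + e^{-Z}}$:
What is the probability that $x$ belongs to class 1? $P(y = 1 \mid x; w) = \sigma(Z)$
What is the probability that $x$ belongs to class 0? $P(y = 0 \mid x; w) = 1 - \sigma(Z)$
Putting these together: $P(y \mid x; w) = \sigma(Z)^{y} \, (1 - \sigma(Z))^{1 - y}$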
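The two class probabilities are usually combined into a single expression, p^y (1 - p)^(1 - y), where p is the probability of class 1 and y is the 0/1 label. A minimal sketch, assuming that form, follows; the function name is hypothetical.

```python
def prob_of_label(p1, y):
    """P(y | x) for a 0/1 label y, given p1 = P(y = 1 | x).

    For y = 1 the second factor is 1, leaving p1.
    For y = 0 the first factor is 1, leaving 1 - p1.
    """
    return (p1 ** y) * ((1.0 - p1) ** (1 - y))

print(prob_of_label(0.8, 1))   # probability assigned to class 1
print(prob_of_label(0.8, 0))   # probability assigned to class 0
```

Writing both cases as one product is what makes the likelihood over many samples easy to express.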

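Estimating Model Parameters
Likelihood function (we have $N$ training examples $(x_i, y_i)$).
Assume that the training examples are independent:
$L(w) = \prod_{i=1}^{N} P(y_i \mid x_i; w)$
Apply $P(y_i \mid x_i; w) = p_i^{y_i} (1 - p_i)^{1 - y_i}$, where $p_i = P(y_i = 1 \mid x_i; w)$:
$L(w) = \prod_{i=1}^{N} p_i^{y_i} (1 - p_i)^{1 - y_i}$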

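Estimating Model Parameters
We now have the likelihood function: $L(w) = \prod_{i=1}^{N} p_i^{y_i} (1 - p_i)^{1 - y_i}$, where $p_i = P(y_i = 1 \mid x_i; w)$.
What do we have to do now? Maximize the likelihood function: $w^{*} = \arg\max_{w} L(w)$
But the product $L(w)$ is complex to manage.
Alternative solution: maximize $\log L(w)$ instead. The logarithm is increasing, so both are maximized by the same $w$.
So, the final (log-)likelihood function is
$\ell(w) = \log L(w) = \sum_{i=1}^{N} \left[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \right]$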
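One practical reason the product is hard to manage is numerical: a product of many probabilities underflows to zero in floating point, while the sum of their logarithms stays well behaved. The sketch below illustrates this with made-up probabilities and labels. The function names are hypothetical, and labels are assumed to be 0 or 1.

```python
import math

def likelihood(probs, labels):
    """Product over samples of p_i if y_i = 1, else (1 - p_i)."""
    result = 1.0
    for p, y in zip(probs, labels):
        result *= p if y == 1 else (1.0 - p)
    return result

def log_likelihood(probs, labels):
    """Sum over samples of log(p_i) if y_i = 1, else log(1 - p_i)."""
    return sum(math.log(p) if y == 1 else math.log(1.0 - p)
               for p, y in zip(probs, labels))

# Small case: the log of the product equals the sum of the logs.
probs = [0.9, 0.2, 0.7]
labels = [1, 0, 1]
L = likelihood(probs, labels)
ll = log_likelihood(probs, labels)
print(L, math.log(L), ll)

# Large case: the product underflows, the log-sum does not.
many_probs = [0.5] * 2000
many_labels = [1] * 2000
print(likelihood(many_probs, many_labels))
print(log_likelihood(many_probs, many_labels))
```

The sum form is also easier to differentiate term by term, which the gradient-based update relies on.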

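Estimating Model Parameters
Apply stochastic gradient ascent (we will see it in a moment): update the parameters using one training sample at a time.
Gradient ascent: $w_j \leftarrow w_j + r \, \frac{\partial \ell(w)}{\partial w_j}$, where $r > 0$ is the rate constant.
We just need to find $\frac{\partial \ell(w)}{\partial w_j}$ for all $j$.
Log-likelihood function: $\ell(w) = \sum_{i} \left[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \right]$, where $p_i = P(y_i = 1 \mid x_i; w)$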
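A minimal sketch of stochastic gradient ascent for logistic regression follows. It relies on the standard result that, for a single sample, the derivative of the log-likelihood with respect to weight j is (y - p) * x_j, and (y - p) for the bias. The data set, the rate constant, the number of passes and all function names are illustrative assumptions, not values from the lecture.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, x):
    """P(y = 1 | x) under the current parameters."""
    return sigmoid(bias + sum(w * xj for w, xj in zip(weights, x)))

def log_likelihood(weights, bias, data):
    """Sum of log-probabilities assigned to the observed 0/1 labels."""
    total = 0.0
    for x, y in data:
        p = predict_proba(weights, bias, x)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total

def sga_train(data, rate=0.1, epochs=200):
    """Stochastic gradient ascent: one parameter update per training sample."""
    n_features = len(data[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        for x, y in data:
            error = y - predict_proba(weights, bias, x)  # per-sample (y - p)
            bias += rate * error
            for j in range(n_features):
                weights[j] += rate * error * x[j]
    return weights, bias

# Made-up one-feature data with some class overlap.
data = [([-2.0], 0), ([-1.0], 0), ([-0.5], 1),
        ([0.5], 0), ([1.0], 1), ([2.0], 1)]

ll_before = log_likelihood([0.0], 0.0, data)
weights, bias = sga_train(data)
ll_after = log_likelihood(weights, bias, data)
print(ll_before, ll_after, weights, bias)
```

The samples are visited in a fixed order here to keep the run deterministic; in practice the order is usually shuffled on each pass.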

Optimization: Gradient Descent

Review of Differential Calculus
Consider the univariate function $y = f(x)$.
$\frac{dy}{dx}$ at $x = x_0$ means the rate of change of $y$ w.r.t. $x$ at that point.
$\frac{dy}{dx} < 0$ means the function is decreasing.
$\frac{dy}{dx} > 0$ means the function is increasing.
What is the sign of $\frac{dy}{dx}$ here? What is the sign of $\frac{dy}{dx}$ here? (Two points marked on the curve.)
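The sign rule can be checked numerically with a central-difference estimate of the derivative. The sketch below uses f(x) = x^2 purely as an assumed example (the slide's own example function did not survive extraction); the helper name and the step size h are illustrative.

```python
def numerical_derivative(f, x, h=1e-6):
    """Central-difference estimate of df/dx at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def f(x):
    """Assumed example function: a parabola with its minimum at x = 0."""
    return x * x

slope_left = numerical_derivative(f, -1.5)   # left of the minimum
slope_right = numerical_derivative(f, 2.0)   # right of the minimum

print(slope_left, slope_right)
```

A negative slope on the left and a positive slope on the right both point away from the minimum, which is why gradient descent moves against the sign of the derivative.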

Review of Differential Calculus
Multivariate function: a function containing more than one independent variable.
Example: y = + +
Partial differentiation: vary one variable at a time and keep the others fixed.
2z,
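Partial differentiation can be sketched numerically by nudging one variable while holding the others fixed. The function f(x, y, z) = x^2 + y^2 + z^2 below is a hypothetical stand-in, not necessarily the slide's example; its exact partial derivatives are 2x, 2y and 2z. The helper name and the step size are also illustrative.

```python
def partial_derivative(f, point, index, h=1e-6):
    """Central-difference estimate of the partial derivative of f
    with respect to the variable at position `index`."""
    up = list(point)
    down = list(point)
    up[index] += h      # vary only one variable...
    down[index] -= h    # ...and keep the others fixed
    return (f(*up) - f(*down)) / (2.0 * h)

def f(x, y, z):
    """Hypothetical example: partials are 2x, 2y and 2z."""
    return x * x + y * y + z * z

point = (1.0, -2.0, 3.0)
gradient = [partial_derivative(f, point, i) for i in range(3)]
print(gradient)   # close to [2.0, -4.0, 6.0]
```

Collecting the partial derivatives into one list gives the gradient, the quantity that the update rules in gradient ascent and descent use.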

Local and Global Minima
[Figure: plot of F(x) against x.]

Gradient Descent
Goal: minimize a function (the error / objective function) w.r.t. the parameters that the function contains.
Two possible objectives: (1) only the minimum value of the function; (2) the values of the parameters that minimize the function.
[Figure: training data as input / output pairs.]

Gradient Descent
Role of gradient descent: to find the values of the parameters that minimize an objective function $F(w)$.
1. If we start from the red circle, what is the direction of movement? Ans: downward.
2. How can we get that? Ans: by looking at the derivative.
3. What is the sign of $\frac{dF}{dw}$ there? Ans: positive.
Then, in which direction should we move? Ans: the negative direction (decrease $w$).
How do we change the parameter to reach the minimum? $w \leftarrow w - r \, \frac{dF}{dw}$, where $r > 0$ (a scalar) is a parameter called the rate constant.
[Figure: curve of $F$ with the starting point (red circle) and the minimum, labelled "we want to reach here".]
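The update rule can be sketched on a one-parameter objective. The function F(w) = (w - 3)^2, the starting point, the rate constant and the number of steps below are all illustrative assumptions; the function names are hypothetical.

```python
def gradient_descent(dF, w0, rate=0.1, steps=100):
    """Repeatedly move against the derivative: w <- w - rate * dF(w)."""
    w = w0
    history = [w]
    for _ in range(steps):
        w = w - rate * dF(w)
        history.append(w)
    return w, history

def F(w):
    """Assumed objective with its minimum at w = 3."""
    return (w - 3.0) ** 2

def dF(w):
    """Derivative of F."""
    return 2.0 * (w - 3.0)

# Start to the right of the minimum, where the derivative is positive,
# so the first update decreases w.
w_min, history = gradient_descent(dF, w0=6.0)
print(history[0], history[1], w_min, F(w_min))
```

With this objective each step shrinks the distance to the minimum by a constant factor. A rate constant that is too large would overshoot instead, so the value 0.1 is a choice that happens to suit this example.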