Deep learning introduction basic information


Slide Content

Introduction to Deep Learning
Pabitra Mitra
Indian Institute of Technology Kharagpur
[email protected]
NSM Workshop on Accelerated Data Science

Deep Learning
• Based on neural networks
• Uses deep architectures
• Very successful in many applications

Perceptron
[Diagram: input values x_1, …, x_m with weights w_1, …, w_m feed a summing function with bias b, producing the induced field v; an activation function φ then gives the output y]
v = Σ_{i=1}^{m} w_i x_i + b,  y = φ(v)
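The diagram maps directly onto a few lines of NumPy. A minimal sketch, assuming a step activation and hand-picked weights that make the unit compute logical AND (the example task and weights are illustrative, not from the slides):

```python
import numpy as np

def perceptron(x, w, b, phi):
    """Induced field v = w·x + b; output y = phi(v)."""
    v = np.dot(w, x) + b
    return phi(v)

# Step activation and weights chosen so the unit computes logical AND.
step = lambda v: 1 if v > 0 else 0
w, b = np.array([1.0, 1.0]), -1.5
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, perceptron(np.array(x, dtype=float), w, b, step))
```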

Neuron Models
● The choice of activation function φ determines the neuron model. Examples:
● Step function: φ(v) = a if v > c, b otherwise
● Ramp function: φ(v) = a if v ≤ c; b if v ≥ d; a + (v − c)(b − a)/(d − c) otherwise
● Sigmoid function with z, x, y parameters: φ(v) = z + 1/(1 + exp(−x·v + y))
● Gaussian function: φ(v) = (1/(√(2π) σ)) exp(−½ ((v − μ)/σ)²)
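All four activations are easy to write down in NumPy. A minimal sketch; the parameter defaults, and the names mu and sigma for the Gaussian's mean and spread, are assumptions rather than slide content:

```python
import numpy as np

def step(v, a=1.0, b=0.0, c=0.0):
    return np.where(v > c, a, b)

def ramp(v, a=0.0, b=1.0, c=0.0, d=1.0):
    lin = a + (v - c) * (b - a) / (d - c)            # linear segment between c and d
    return np.where(v <= c, a, np.where(v >= d, b, lin))

def sigmoid(v, z=0.0, x=1.0, y=0.0):
    # z shifts the output range, x scales the input, y shifts the input
    return z + 1.0 / (1.0 + np.exp(-x * v + y))

def gaussian(v, mu=0.0, sigma=1.0):
    return np.exp(-0.5 * ((v - mu) / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)

v = np.linspace(-3, 3, 7)
for phi in (step, ramp, sigmoid, gaussian):
    print(phi.__name__, np.round(phi(v), 3))
```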

Sigmoid unit
• f is the sigmoid (logistic) function: f(x) = 1/(1 + exp(−x))
• Its derivative can be easily computed: df(x)/dx = f(x)(1 − f(x))
• Used in many applications; other functions are possible (e.g., tanh)
• Single unit: apply the gradient descent rule
• Multilayer networks: backpropagation
[Diagram: a unit with inputs x_1, …, x_n, weights w_1, …, w_n, and a bias input x_0 = 1 with weight w_0; net = Σ_{i=0}^{n} w_i x_i and the output is o = f(net)]
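Both facts fit in a short NumPy sketch: the derivative identity f'(x) = f(x)(1 − f(x)) drives the gradient descent rule for a single unit. The AND task, learning rate, and iteration count below are illustrative assumptions:

```python
import numpy as np

def f(x):
    return 1.0 / (1.0 + np.exp(-x))

# X carries a constant column x0 = 1, so w[0] plays the role of w0.
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
t = np.array([0.0, 0.0, 0.0, 1.0])               # AND targets
w = np.zeros(3)
eta = 0.5                                        # learning rate
for _ in range(5000):
    o = f(X @ w)                                 # o = f(net)
    w += eta * X.T @ ((t - o) * o * (1 - o))     # uses f'(x) = f(x)(1 - f(x))
print(np.round(f(X @ w), 2))                     # approaches [0, 0, 0, 1]
```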

Multi-layer feed-forward NN (FFNN)
● FFNN is a more general network architecture, where there are hidden layers between the input and output layers.
● Hidden nodes do not directly receive inputs nor send outputs to the external environment.
● FFNNs overcome the limitation of single-layer NNs: they can handle non-linearly separable learning tasks.
[Diagram: a 3-4-2 network with an input layer, one hidden layer of four nodes, and an output layer of two nodes]
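A forward pass through the pictured 3-4-2 network is just two matrix multiplications. A minimal sketch, assuming sigmoid units and random illustrative weights:

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # input layer -> 4 hidden nodes
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)   # hidden layer -> 2 output nodes

x = np.array([0.5, -1.0, 2.0])                  # 3 inputs
h = sigmoid(W1 @ x + b1)                        # hidden activations, not externally visible
y = sigmoid(W2 @ h + b2)                        # 2 network outputs
print(np.round(y, 3))
```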

Backpropagation
• Initialize all weights to small random numbers
• Repeat: for each training example
  1. Input the training example to the network and compute the network outputs
  2. For each output unit k: δ_k ← o_k (1 − o_k)(t_k − o_k)
  3. For each hidden unit h: δ_h ← o_h (1 − o_h) Σ_{k ∈ outputs} w_{k,h} δ_k
  4. Update each network weight: w_{j,i} ← w_{j,i} + Δw_{j,i}, where Δw_{j,i} = η δ_j x_{j,i}
(These four steps are sketched in code below.)
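A minimal NumPy sketch of the four steps, assuming sigmoid units and a 2-3-1 network trained on XOR; the task, network size, learning rate, and epoch count are illustrative assumptions:

```python
import numpy as np

def f(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_h = 3                                          # hidden units (illustrative)
W_h = rng.normal(scale=0.5, size=(n_h, 3))       # hidden weights, last column = bias
W_o = rng.normal(scale=0.5, size=(1, n_h + 1))   # output weights w_{k,h}, last = bias
eta = 0.5                                        # learning rate

data = [([0, 0], 0.0), ([0, 1], 1.0), ([1, 0], 1.0), ([1, 1], 0.0)]  # XOR
for _ in range(10000):
    for x, t in data:
        x_in = np.append(x, 1.0)
        o_h = f(W_h @ x_in)                      # step 1: forward pass
        h_in = np.append(o_h, 1.0)
        o_k = f(W_o @ h_in)
        d_k = o_k * (1 - o_k) * (t - o_k)        # step 2: output deltas
        d_h = o_h * (1 - o_h) * (W_o[:, :n_h].T @ d_k)  # step 3: hidden deltas
        W_o += eta * np.outer(d_k, h_in)         # step 4: weight updates
        W_h += eta * np.outer(d_h, x_in)

for x, _ in data:
    o = f(W_o @ np.append(f(W_h @ np.append(x, 1.0)), 1.0))
    print(x, np.round(o, 2))                     # should approach 0, 1, 1, 0
```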

NN Design Issues
● Data representation
● Network topology
● Network parameters
● Training
● Validation

Expressiveness
• Every bounded continuous function can be approximated with arbitrarily small error by a network with one hidden layer (Cybenko et al. '89)
• Hidden layer of sigmoid functions
• Output layer of linear functions
• Any function can be approximated to arbitrary accuracy by a network with two hidden layers (Cybenko '88)
• Sigmoid units in both hidden layers
• Output layer of linear functions

Choice of Neural Network Architecture
• Training set error vs. generalization error

Motivation for Depth

Motivation: Mimic the Brain Structure
[Diagram: sensory neurons feed mid/low-level neurons arranged in coupled layers, which feed the higher brain for a decision; the analogous end-to-end neural architecture takes an input signal through feature extraction and learning to a decision]

Motivation
• Practical success in computer vision, signal processing, text mining
• Increase in volume and complexity of data
• Availability of GPUs

Convolutional Neural Network: Motivation

CNN
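The two CNN slides are figure-only. As a concrete stand-in, here is a minimal PyTorch sketch of the kind of architecture they depict; the input size, channel counts, and 10-class head are illustrative assumptions:

```python
import torch
import torch.nn as nn

# Stacked convolution + ReLU + pooling layers, then a linear classifier.
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),  # local, weight-shared filters
    nn.ReLU(),
    nn.MaxPool2d(2),                             # downsample 28x28 -> 14x14
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),                             # 14x14 -> 7x7
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),                   # 10 output classes
)
print(model(torch.randn(1, 1, 28, 28)).shape)    # torch.Size([1, 10])
```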

ResNet
CNN + Skip Connections
Pyramidal cells in cortex

Full ResNet architecture:
• Stack residual blocks (one block is sketched in code below)
• Every residual block has two 3x3 conv layers
• Periodically, double the number of filters and downsample spatially using stride 2 (in each dimension)
• Additional conv layer at the beginning
• No FC layers at the end (only an FC-1000 layer to the output classes)
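A minimal PyTorch sketch of one such residual block; the batch-norm placement follows the standard ResNet recipe, and the channel sizes in the usage line are illustrative:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Two 3x3 conv layers plus a skip connection. When stride=2 the block
    downsamples and changes the filter count, so the skip path needs a
    1x1 conv to match shapes."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.skip = nn.Identity()
        if stride != 1 or in_ch != out_ch:
            self.skip = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch))

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + self.skip(x))    # F(x) + x

block = ResidualBlock(64, 128, stride=2)         # double filters, downsample by 2
print(block(torch.randn(1, 64, 32, 32)).shape)   # torch.Size([1, 128, 16, 16])
```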

DenseNet

Challenges of Depth
• Overfitting – dropout
• Vanishing gradient – ReLU activation
• Accelerating training – batch normalization
• Hyperparameter tuning
(The first three remedies are shown together in the sketch below.)
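A minimal PyTorch sketch combining the three remedies in one small model; the layer sizes and dropout rate are illustrative assumptions:

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(784, 256),
    nn.BatchNorm1d(256),   # batch normalization: accelerates/stabilizes training
    nn.ReLU(),             # ReLU: mitigates vanishing gradients vs. sigmoid
    nn.Dropout(p=0.5),     # dropout: regularizes against overfitting
    nn.Linear(256, 10),
)
model.train()                          # dropout and BN batch statistics active
out = model(torch.randn(32, 784))
print(out.shape)                       # torch.Size([32, 10])
model.eval()                           # dropout disabled, BN uses running stats
```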

Computational Complexity

Types of Deep Architectures
• RNN, LSTM (sequence learning)
• Stacked autoencoders (representation learning)
• GAN (classification, distribution learning)
• Combining architectures – unified backprop if all layers are differentiable (see the sketch below)
• Frameworks: TensorFlow, PyTorch
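A minimal PyTorch sketch of combining architectures: a CNN feature extractor feeding an LSTM, trained end to end with a single backward pass because every layer is differentiable. All shapes, sizes, and the two-class task are illustrative assumptions:

```python
import torch
import torch.nn as nn

cnn = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten())  # frame -> 8-dim vector
lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
head = nn.Linear(16, 2)

frames = torch.randn(4, 10, 1, 28, 28)           # batch of 10-frame sequences
feats = cnn(frames.flatten(0, 1)).view(4, 10, 8) # per-frame CNN features
out, _ = lstm(feats)                             # sequence model over features
logits = head(out[:, -1])                        # classify from the last step
loss = nn.functional.cross_entropy(logits, torch.randint(0, 2, (4,)))
loss.backward()                                  # unified backprop through all parts
```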

References
• Deep Learning – Ian Goodfellow, Yoshua Bengio, and Aaron Courville (MIT Press, 2016)
• Stanford Deep Learning course