AIS302-Artificial Neural Networks-Spr24-lec3.pdf

Slide Content

AIS302
Artificial Neural Networks
Spring 24
Ghada Khoriba
Associate Prof. of Artificial Intelligence

Agenda
• Recap
• RNN, recurrent neural networks
References:
• Princeton University COS 495, Instructor: Yingyu Liang
• https://www.deeplearningbook.org/
• Algorithmic Intelligence Laboratory, alinlab.kaist.ac.kr

Recurrent neural networks
• Dates back to (Rumelhart et al., 1986)
• A family of neural networks for handling sequential data, which involves variable-length inputs or outputs
• Especially useful for natural language processing (NLP)

Sequential data
• Each data point: a sequence of vectors x^(t), for 1 ≤ t ≤ τ
• Batch data: many sequences with different lengths τ
• Label: can be a scalar, a vector, or even a sequence
• Examples:
  • Sentiment analysis
  • Machine translation

Example: machine translation

More complicated sequential data
• Data point: two-dimensional sequences, like images
• Label: different types of sequences, like text sentences
• Example: image captioning

Image captioning
Figure from the paper "DenseCap: Fully Convolutional Localization Networks for Dense Captioning", by Justin Johnson, Andrej Karpathy, Li Fei-Fei

Sequential Data Problems
• Fixed-sized input to fixed-sized output (e.g., image classification)
• Sequence output (e.g., image captioning: takes an image and outputs a sentence of words)
• Sequence input (e.g., sentiment analysis, where a given sentence is classified as expressing positive or negative sentiment)
• Sequence input and sequence output (e.g., machine translation: an RNN reads a sentence in English and then outputs a sentence in French)
• Synced sequence input and output (e.g., video classification, where we wish to label each frame of the video)
Credits: Andrej Karpathy

A typical dynamic system
s^(t+1) = f(s^(t); θ)
Figure from Deep Learning, Goodfellow, Bengio and Courville
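Not on the slide, but useful for reading the figure: unrolling this recurrence for a few steps makes the repeated use of the same function and parameters explicit. A sketch in the Deep Learning book's notation, with s for the state and θ for the shared parameters:

```latex
% Unrolling s^{(t+1)} = f(s^{(t)}; \theta) for two steps:
s^{(3)} \;=\; f\big(s^{(2)}; \theta\big) \;=\; f\big(f(s^{(1)}; \theta); \theta\big)
```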

A system driven by external data
s^(t+1) = f(s^(t), x^(t+1); θ)

Compact view
s^(t+1) = f(s^(t), x^(t+1); θ)

Compact view
s^(t+1) = f(s^(t), x^(t+1); θ)
Key: the same f and θ are used for all time steps
Square: a one-step time delay

Recurrent neural networks
• Use the same computational function and parameters across different time steps of the sequence
• Each time step: takes the input entry and the previous hidden state to compute the output entry
• Loss: typically computed at every time step (see the sketch below)
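The slides contain no code; the following is a minimal NumPy sketch of this per-step computation, assuming the tanh/softmax formulation from the Deep Learning book. The function name rnn_step, the parameter names W, U, V, b, c, and the toy dimensions are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    z = z - z.max()                 # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def rnn_step(x_t, h_prev, W, U, V, b, c):
    """One time step: the same function and parameters are reused for every t."""
    h_t = np.tanh(b + W @ h_prev + U @ x_t)   # new hidden state from input + previous state
    y_t = softmax(c + V @ h_t)                # per-step output distribution
    return h_t, y_t

# Toy dimensions (assumed): 4-dim inputs, 8-dim hidden state, 3 output classes
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8)); U = rng.normal(size=(8, 4))
V = rng.normal(size=(3, 8)); b = np.zeros(8); c = np.zeros(3)

xs = [rng.normal(size=4) for _ in range(5)]   # a length-5 input sequence
h = np.zeros(8)                               # initial hidden state
for x_t in xs:
    h, y_t = rnn_step(x_t, h, W, U, V, b, c)  # same parameters at every step
```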

Recurrent neural networks
[Figure: unrolled RNN computational graph showing, at each time step, the input, hidden state, output, loss, and label]

Recurrent neural networks
Math formula:
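The formula on the slide is an image; for reference, the standard vanilla-RNN formulation from the Deep Learning book (the reference these figures are taken from) is:

```latex
\begin{align}
a^{(t)} &= b + W h^{(t-1)} + U x^{(t)} \\
h^{(t)} &= \tanh\!\big(a^{(t)}\big) \\
o^{(t)} &= c + V h^{(t)} \\
\hat{y}^{(t)} &= \operatorname{softmax}\!\big(o^{(t)}\big)
\end{align}
```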

Advantage
• Hidden state: a lossy summary of the past
• Shared functions and parameters: greatly reduce the capacity and are good for generalization in learning
• Explicitly use the prior knowledge that the sequential data can be processed in the same way at different time steps (e.g., NLP)

Training RNN
• Principle: unfold the computational graph, and use backpropagation
• Called the back-propagation through time (BPTT) algorithm
• Can then apply any general-purpose gradient-based techniques
• Conceptually: first compute the gradients of the internal nodes, then compute the gradients of the parameters (see the sketch below)
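Not from the slides: a minimal PyTorch sketch of BPTT by unrolling, where the per-step losses are summed and a single backward pass propagates gradients through time. The model, sizes, and variable names are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Assumed toy sizes: 4-dim inputs, 8-dim hidden state, 3 output classes
cell = nn.RNNCell(input_size=4, hidden_size=8)        # shared parameters for every time step
readout = nn.Linear(8, 3)                             # shared output layer
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(list(cell.parameters()) + list(readout.parameters()), lr=0.1)

xs = torch.randn(5, 4)                  # a length-5 input sequence
ys = torch.randint(0, 3, (5,))          # one label per time step

h = torch.zeros(1, 8)                   # initial hidden state (batch of 1)
total_loss = torch.zeros(())
for t in range(xs.size(0)):             # unfold the computational graph over time
    h = cell(xs[t].unsqueeze(0), h)     # same cell (same parameters) at every step
    o = readout(h)                      # per-step output
    total_loss = total_loss + loss_fn(o, ys[t].unsqueeze(0))   # loss computed every time step

optimizer.zero_grad()
total_loss.backward()                   # back-propagation through time (BPTT)
optimizer.step()                        # any gradient-based optimizer can be applied
```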

Recurrent neural networks
Math formula:
Figure from Deep Learning, Goodfellow, Bengio and Courville

Recurrent neural networks
Gradient at L^(t): the total loss is the sum of the losses at the different time steps
Figure from Deep Learning, Goodfellow, Bengio and Courville
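In the Deep Learning book's notation (an assumption, since the slide equation is an image), this is simply:

```latex
L = \sum_{t} L^{(t)}, \qquad \frac{\partial L}{\partial L^{(t)}} = 1
```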

Recurrent neural networks
Gradient at o^(t):
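Assuming a softmax output with negative log-likelihood loss, as in the Deep Learning book's derivation, the gradient at the output node is:

```latex
\big(\nabla_{o^{(t)}} L\big)_i \;=\; \frac{\partial L}{\partial o_i^{(t)}}
  \;=\; \hat{y}_i^{(t)} - \mathbf{1}_{\,i = y^{(t)}}
```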

Recurrent neural networks
Gradient at h^(τ): here ∇_{o^(τ)} L is the gradient of the loss L with respect to the output o at the final time step τ.
Gradients are propagated backwards through time to update the parameters of the RNN. This allows the RNN to learn temporal dependencies from sequences of data.
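Following the Deep Learning book's BPTT derivation (an assumption, since the slide formula is an image): at the final step τ the hidden state only feeds the output, so

```latex
\nabla_{h^{(\tau)}} L \;=\; V^{\top} \, \nabla_{o^{(\tau)}} L
```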

Recurrent neural networks
Gradient at h^(t), for t < τ:
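For earlier steps, h^(t) influences both the output o^(t) and the next hidden state h^(t+1); with tanh hidden units, the book's recursion is (again an assumption about what the slide image shows):

```latex
\nabla_{h^{(t)}} L \;=\;
  W^{\top} \operatorname{diag}\!\big(1 - (h^{(t+1)})^{2}\big)\, \nabla_{h^{(t+1)}} L
  \;+\; V^{\top} \, \nabla_{o^{(t)}} L
```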

Recurrent neural networks
Gradient at the parameters θ:
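Since the parameters are shared across time steps, their gradients sum the contributions from every step. In the Deep Learning book's notation (an assumption; tanh hidden units, parameters b, c, U, V, W):

```latex
\begin{align}
\nabla_{c} L &= \sum_{t} \nabla_{o^{(t)}} L \\
\nabla_{b} L &= \sum_{t} \operatorname{diag}\!\big(1 - (h^{(t)})^{2}\big)\, \nabla_{h^{(t)}} L \\
\nabla_{V} L &= \sum_{t} \big(\nabla_{o^{(t)}} L\big)\, h^{(t)\top} \\
\nabla_{W} L &= \sum_{t} \operatorname{diag}\!\big(1 - (h^{(t)})^{2}\big)\, \big(\nabla_{h^{(t)}} L\big)\, h^{(t-1)\top} \\
\nabla_{U} L &= \sum_{t} \operatorname{diag}\!\big(1 - (h^{(t)})^{2}\big)\, \big(\nabla_{h^{(t)}} L\big)\, x^{(t)\top}
\end{align}
```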

RNN
• Use the same computational function and parameters across different time steps of the sequence
• Each time step: takes the input entry and the previous hidden state to compute the output entry
• Loss: typically computed at every time step
• Many variants:
  • Information about the past can be in many other forms
  • Only output at the end of the sequence

Example: only output at the end
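Not from the slides: a minimal PyTorch sketch of this many-to-one pattern (e.g., classifying a whole sentence's sentiment). The sizes and names are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Assumed toy sizes: 4-dim inputs, 8-dim hidden state, 2 classes (e.g., positive/negative)
rnn = nn.RNN(input_size=4, hidden_size=8, batch_first=True)
classifier = nn.Linear(8, 2)

xs = torch.randn(1, 7, 4)          # one input sequence of length 7
_, h_last = rnn(xs)                # h_last: final hidden state, shape (1, 1, 8)
logits = classifier(h_last[-1])    # only the state at the end of the sequence produces output
print(logits.shape)                # torch.Size([1, 2])
```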

Bidirectional RNNs
• Many applications: the output at time t may depend on the whole input sequence
• Example in speech recognition: the correct interpretation of the current sound may depend on the next few phonemes, potentially even the next few words
• Bidirectional RNNs are introduced to address this

BiRNNs
[Figure: bidirectional RNN producing per-time-step tags]
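Not from the slides: a minimal PyTorch sketch of a bidirectional RNN tagger along these lines, where each time step's prediction can depend on the whole input sequence. The sizes and names are illustrative assumptions.

```python
import torch
import torch.nn as nn

# bidirectional=True runs one RNN forward in time and one backward, then concatenates states
birnn = nn.RNN(input_size=4, hidden_size=8, batch_first=True, bidirectional=True)
tagger = nn.Linear(2 * 8, 5)       # per-step tag scores use both directions (assume 5 tags)

xs = torch.randn(1, 7, 4)          # one input sequence of length 7
out, _ = birnn(xs)                 # out: (1, 7, 16), forward and backward states concatenated
tag_scores = tagger(out)           # one prediction per time step, informed by the whole sequence
print(tag_scores.shape)            # torch.Size([1, 7, 5])
```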