Viterbi algorithm

Slide Content

VITERBI ALGORITHM

Introduction

The Viterbi algorithm (VA) was first proposed by Andrew J. Viterbi in 1967. It is a dynamic programming algorithm used to find the most likely sequence of hidden states, called the Viterbi path, that results in a sequence of observed events, especially in the context of Hidden Markov Models (HMMs).

Applications

The algorithm found its original application in communications, for decoding in dial-up modems, satellite and deep-space communications, and wireless LANs. It is now also commonly used in speech recognition, speech synthesis, natural language processing, computational linguistics, and bioinformatics.

Hidden Markov Model

Markov models are used to model sequences of events (or observations) that occur one after another. In a Hidden Markov Model, the state is not directly visible, but the observations, which depend on the state, are visible. Each state has a probability distribution over the possible outputs. The sequence of observations generated by an HMM therefore gives some information about the underlying sequence of states.
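To make this precise, the Viterbi algorithm computes the state sequence that maximizes the joint probability of states and observations. In standard HMM notation (this formula is implied by the slides rather than written out on them):

\[
x_{1:T}^{*} \;=\; \operatorname*{arg\,max}_{x_1,\dots,x_T}\;
\pi_{x_1}\, b_{x_1}(y_1) \prod_{t=2}^{T} a_{x_{t-1},x_t}\, b_{x_t}(y_t)
\]

where \(\pi\) is the initial state distribution, \(a_{i,j}\) the transition probabilities, and \(b_i(y)\) the emission probabilities, as defined on the next slide.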

An Example of a Hidden Markov Model (State Diagram)

[Diagram: hidden states connected by transition arcs (a_12, a_21, a_23), each emitting observable events.]

• a_ij → probability of a transition from state i to state j
• b_ij → probability of an observation for a given state

The Viterbi Algorithm

Input:
• The state space S = {s_1, s_2, ..., s_N}.
• The observation space O = {o_1, o_2, ..., o_K}.
• A transition matrix A of size N×N, such that A_ij stores the probability of transiting from state s_i to state s_j.
• An emission matrix B of size N×K, such that B_ij stores the probability of observing o_j from state s_i.
• An array of initial probabilities π of size N, such that π_i stores the probability of state s_i at time t = 1.
• A sequence of observations y_1, y_2, ..., y_T.

Output: the most likely hidden state sequence X = {x_1, x_2, ..., x_T}.

Algorithm:

function VITERBI(O, S, π, A, B) : X
    for each state s from 1 to N do
        Viterbi[s, 1] ← π_s * B_{s, o_1}
        Backpointer[s, 1] ← 0
    end for
    for each time step t from 2 to T do
        for each state s from 1 to N do
            Viterbi[s, t] ← max_{k=1..N} ( Viterbi[k, t-1] * A_{k,s} * B_{s, o_t} )
            Backpointer[s, t] ← argmax_{k=1..N} ( Viterbi[k, t-1] * A_{k,s} * B_{s, o_t} )
        end for
    end for

    Z_T ← argmax_{s=1..N} ( Viterbi[s, T] )
    X_T ← S_{Z_T}
    for i ← T, T-1, ..., 2 do
        Z_{i-1} ← Backpointer[Z_i, i]
        X_{i-1} ← S_{Z_{i-1}}
    end for
    return X
end function

The complexity of the algorithm is O(T * N²).
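To make the pseudocode concrete, here is a minimal Python sketch of the same procedure. The function name, the dictionary-based parameters, and the None initial backpointer are illustrative choices, not from the original slides:

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely hidden state sequence for `obs` and its probability.

    obs     : list of observations y_1 .. y_T
    states  : list of state names s_1 .. s_N
    start_p : dict state -> initial probability (pi)
    trans_p : dict state -> dict state -> transition probability (A)
    emit_p  : dict state -> dict observation -> emission probability (B)
    """
    # Initialization (t = 1): Viterbi[s, 1] = pi_s * B[s, o_1]
    V = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    backpointer = [{s: None for s in states}]

    # Recursion (t = 2 .. T): keep, for each state, the best incoming path.
    for t in range(1, len(obs)):
        V.append({})
        backpointer.append({})
        for s in states:
            best_prev = max(states, key=lambda k: V[t - 1][k] * trans_p[k][s])
            V[t][s] = V[t - 1][best_prev] * trans_p[best_prev][s] * emit_p[s][obs[t]]
            backpointer[t][s] = best_prev

    # Termination: best final state, then follow backpointers from T down to 2.
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.insert(0, backpointer[t][path[0]])
    return path, V[-1][last]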

Implementation Example

Consider a doctor who diagnoses fever by asking patients how they feel. A patient may only answer that they feel normal, dizzy, or cold. There are two states, "Healthy" and "Fever", but the doctor cannot observe them directly; they are hidden. On each day, there is a certain chance that the patient will tell the doctor that they feel "normal", "cold", or "dizzy", depending on their health condition.

Inputs:
• States (S) = {Healthy, Fever}
• Observations (O) = {Normal, Cold, Dizzy}
• Start probability (π): Healthy = 0.6, Fever = 0.4
• Transition probability (A):

                Healthy   Fever
    Healthy       0.7      0.3
    Fever         0.4      0.6

• Emission probability (B):

                Normal   Cold   Dizzy
    Healthy       0.5     0.4    0.1
    Fever         0.1     0.3    0.6
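With the Python sketch from the algorithm section, these inputs can be written out directly (the variable names are illustrative):

states = ["Healthy", "Fever"]
observations = ["Normal", "Cold", "Dizzy"]
start_p = {"Healthy": 0.6, "Fever": 0.4}
trans_p = {
    "Healthy": {"Healthy": 0.7, "Fever": 0.3},
    "Fever":   {"Healthy": 0.4, "Fever": 0.6},
}
emit_p = {
    "Healthy": {"Normal": 0.5, "Cold": 0.4, "Dizzy": 0.1},
    "Fever":   {"Normal": 0.1, "Cold": 0.3, "Dizzy": 0.6},
}

path, prob = viterbi(observations, states, start_p, trans_p, emit_p)
print(path, prob)  # ['Healthy', 'Healthy', 'Fever'] 0.01512, as derived step by step below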

Day 1 (Observation: Normal). Calculate P(start) * P(normal | state):
• Healthy: π_H * P(Normal | H) = 0.6 * 0.5 = 0.30
• Fever: π_F * P(Normal | F) = 0.4 * 0.1 = 0.04

Day 2 (Observation: Cold). Calculate P(old_state) * P(old_state → new_state) * P(cold | new_state):
• P(H) * P(H → H) * P(Cold | H) = 0.30 * 0.7 * 0.4 = 0.084
• P(H) * P(H → F) * P(Cold | F) = 0.30 * 0.3 * 0.3 = 0.027
• P(F) * P(F → H) * P(Cold | H) = 0.04 * 0.4 * 0.4 = 0.0064
• P(F) * P(F → F) * P(Cold | F) = 0.04 * 0.6 * 0.3 = 0.0072

For each state H/F, select the incoming path with the highest probability:
• Healthy: max(0.084, 0.0064) = 0.084 (via Healthy)
• Fever: max(0.027, 0.0072) = 0.027 (via Healthy)

Day 3 (Observation: Dizzy). Calculate P(old_state) * P(old_state → new_state) * P(dizzy | new_state):
• P(H) * P(H → H) * P(Dizzy | H) = 0.084 * 0.7 * 0.1 = 0.00588
• P(H) * P(H → F) * P(Dizzy | F) = 0.084 * 0.3 * 0.6 = 0.01512
• P(F) * P(F → H) * P(Dizzy | H) = 0.027 * 0.4 * 0.1 = 0.00108
• P(F) * P(F → F) * P(Dizzy | F) = 0.027 * 0.6 * 0.6 = 0.00972

For each state H/F, select the incoming path with the highest probability:
• Healthy: max(0.00588, 0.00108) = 0.00588 (via Healthy)
• Fever: max(0.01512, 0.00972) = 0.01512 (via Healthy)

For time step T, select the state that has the highest probability (here Fever, with 0.01512) and backtrack along the path that produced it using the backpointers, returning the corresponding states. Concretely: Backpointer[Fever, 3] = Healthy and Backpointer[Healthy, 2] = Healthy, giving the sequence Healthy, Healthy, Fever.

Result

• Day 1 (Observation: Normal) → "HEALTHY" (0.30)
• Day 2 (Observation: Cold) → "HEALTHY" (0.084)
• Day 3 (Observation: Dizzy) → "FEVER" (0.01512)

The most likely hidden state sequence is Healthy → Healthy → Fever.
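Note how quickly the path probabilities shrink (0.30 → 0.084 → 0.01512). For long observation sequences such products underflow floating point, so practical implementations usually work in log space; this is a standard refinement, not something shown on the slides. A minimal variant of one recursion step from the earlier sketch:

import math

def step_log(prev_log, states, trans_p, emit_p, obs_t):
    """One log-space recursion step: prev_log maps state -> log probability
    of the best path so far; products of probabilities become sums of logs,
    while max/argmax are unchanged."""
    cur_log, back = {}, {}
    for s in states:
        best = max(states, key=lambda k: prev_log[k] + math.log(trans_p[k][s]))
        cur_log[s] = prev_log[best] + math.log(trans_p[best][s]) + math.log(emit_p[s][obs_t])
        back[s] = best
    return cur_log, back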

Advantages
• Ability to correct wrongly transmitted bits by adding redundant information.
• The state diagram offers a complete description of the system.
• It is possible to reconstruct lost data.

Disadvantages
• Computation becomes complex for a large number of states.
• More bandwidth is needed for the redundant information.

CONCLUSION

The Viterbi algorithm is widely used in communications. It is used to find the hidden state sequence of a finite-state Hidden Markov Model, and it is also used extensively in recognition problems.

THANK YOU