Cont…
Q(5,1) = R(5,1) + 0.8 * max [Q(1,3), Q(1,5)]
= 0 + 0.8 * max [64, 100]
= 0 + 0.8 * 100 = 80
At state 3:
Q(1,3) = R(1,3) + 0.8 * max [Q(3,1), Q(3,2), Q(3,4)]
= 0 + 0.8 * max [80, 0, 0]
= 0 + 0.8 * 80 = 64
Q(4,3) = R(4,3) + 0.8 * max [Q(3,4), Q(3,2), Q(3,1)]
= 0 + 0.8 * max [0, 0, 80]
= 0 + 0.8 * 80 = 64
Q(2,3) = R(2,3) + 0.8 * max [Q(3,2), Q(3,1), Q(3,4)]
= 0 + 0.8 * max [0, 80, 0]
= 0 + 0.8 * 80 = 64
At state 4:
Q(5, 4) = R(5, 4) + 0.8 * max [Q(4,5), Q(4,3), Q(4, 0)]
= 0 + 0.8 * max [100, 64, 0]
= 0 + 0.8 * 100 = 80
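
The hand calculations above all apply the same rule: Q(state, action) = R(state, action) + 0.8 * max [Q of every move available from the next state]. A minimal Python sketch of that rule follows. The function name q_update and the dictionary q are illustrative assumptions, not part of the slides, and the Q-values in the dictionary are simply the numbers plugged into the calculations above, taken as given.

```python
GAMMA = 0.8  # discount factor used in every calculation above


def q_update(reward, next_q_values, gamma=GAMMA):
    """Return R(s, a) + gamma * max of the Q-values reachable from the next state."""
    return reward + gamma * max(next_q_values)


# Q-values substituted into the worked example (assumed as given, not derived here).
q = {
    (1, 3): 64, (1, 5): 100,            # moves available from state 1
    (3, 1): 80, (3, 2): 0, (3, 4): 0,   # moves available from state 3
    (4, 5): 100, (4, 3): 64, (4, 0): 0, # moves available from state 4
}

# Q(5,1): move from state 5 into state 1, then look at the moves out of state 1.
q_5_1 = q_update(0, [q[(1, 3)], q[(1, 5)]])

# Q(1,3), Q(4,3), Q(2,3): all enter state 3 with reward 0, so they share one value.
q_into_3 = q_update(0, [q[(3, 1)], q[(3, 2)], q[(3, 4)]])

# Q(5,4): move from state 5 into state 4, then look at the moves out of state 4.
q_5_4 = q_update(0, [q[(4, 5)], q[(4, 3)], q[(4, 0)]])

print(q_5_1, q_into_3, q_5_4)
```

Because the reward is 0 for every move shown here, each result is just 0.8 times the best Q-value of the state being entered, which is why the three moves into state 3 all come out the same.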
8/20/2024 24 Dr. Shivashankar, ISE, GAT