-
This plot shows the performance of the learning algorithm
for 1 serve over 1,000 trials.
-
Here the weights were dumped at the end of each run and
loaded in the next run to keep tuning the net.
-
The number of iterations here is 3.
-
The blue and pink lines are the final curves in the
3rd iteration.
-
The red and green lines are the curves from the first iteration.
-
Plot a, where
-
a=1 refers to successful shots => reaching the
ball
-
a=2 refers to completely successful shots =>
hitting a correct shot as well
-
Here
-
Gamma = 0.9
-
Lambda = 0.95
-
Reward Function =  2 for complete success
                =  1 for partial success
                = -1 for failure
                =  0 otherwise
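
The settings above can be written out as a short Python sketch. Only the numeric values (0.9, 0.95 and the four rewards) come from these notes. The Outcome labels and function names are hypothetical. The sketch also assumes that Gamma is the discount factor and Lambda the eligibility-trace decay of a TD(lambda)-style learner, which the notes do not state.

```python
from enum import Enum

GAMMA = 0.9    # value from the notes; assumed to be the discount factor
LAMBDA = 0.95  # value from the notes; assumed to be the trace-decay rate


class Outcome(Enum):
    """Hypothetical labels for how a shot ended (not from the original code)."""
    COMPLETE_SUCCESS = "complete_success"  # reached the ball and hit a correct shot
    PARTIAL_SUCCESS = "partial_success"    # reached the ball only
    FAILURE = "failure"
    OTHER = "other"


def reward(outcome):
    """Reward table from the notes: 2, 1, -1, and 0 otherwise."""
    if outcome is Outcome.COMPLETE_SUCCESS:
        return 2
    if outcome is Outcome.PARTIAL_SUCCESS:
        return 1
    if outcome is Outcome.FAILURE:
        return -1
    return 0


def discounted_return(rewards, gamma=GAMMA):
    """Sum of gamma**t * r_t over the rewards of one trial."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def update_trace(trace, gradient, gamma=GAMMA, lam=LAMBDA):
    """One accumulating eligibility-trace step (assumed TD(lambda) form)."""
    return gamma * lam * trace + gradient
```

For example, a trial whose only reward is a complete success on the third step has a discounted return of about 1.62 (0.9 * 0.9 * 2). With these values, an eligibility trace shrinks by a factor of 0.855 per step when no new gradient arrives.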