a) A simple reflex agent acts only on the current percept, ignoring the rest of the percept history. For it to be perfectly rational, its choice based on the current percept alone must give the best outcome or, when there is uncertainty, the best expected outcome. This means the performance measure does not depend on the states the environment was in in the past. For example, a bot playing chess requires only the present board state to make its next move, with the sole objective of "winning". Another example is a bot solving a Rubik's cube.

b) The greedy robot in exercise B is non-deterministic. However, if we have the algorithm that generates the random number used to decide the bot's move when the choices are equal, we can determine its move exactly (a small sketch of this is given after part e).

c) There is some sequence of actions that gives the best performance measure. The randomised agent may draw "random numbers" that lead it to follow exactly these steps, thereby maximising its performance measure. So there is always some chance that the agent ends up giving the best performance, even though with fairly high probability it performs poorly. For instance, if it chooses uniformly among 4 moves at each of n steps, it follows any particular optimal sequence with probability (1/4)^n, which is tiny but non-zero.

d) Suppose the rational agent is the greedy bot from exercise B. It senses a dirt value x in one of the squares, so the true dirt value may lie in the range (a, b), where a satisfies a + 0.2a = x (i.e. a = x/1.2) and b satisfies b - 0.2b = x (i.e. b = x/0.8). We can sort the squares according to these ranges, and in case of overlaps we can use random numbers to resolve the conflict (see the second sketch after part e).

e) Again suppose the rational agent is the greedy bot from exercise B, and that it is taking its n-th step. If clean squares could not become dirty again, the greedy approach would only have to compare three values, because the square the bot just came from would have dirt value zero. Here, however, that dirt value may not be zero. In fact, we can take its dirt value to be (expected dirt value seen so far) * (probability of the square getting dirty again), where the probability of the square getting dirty again can be calculated from the Gaussian distribution as (probability of the square getting dirty within n moves) - (probability of the square getting dirty within n-1 moves). A small sketch of this estimate is given below, after the other sketches.
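To illustrate part b), here is a short Python sketch (not part of the exercise; the names greedy_move and neighbour_dirt are made up for illustration). The greedy bot only looks random because ties between equally dirty squares are broken by a random draw, so once the generator and its seed are known, every move can be reproduced.

    import random

    def greedy_move(neighbour_dirt, rng):
        """Return the neighbouring square with the most dirt; ties broken by rng."""
        best = max(neighbour_dirt.values())
        tied = [sq for sq, d in neighbour_dirt.items() if d == best]
        return rng.choice(tied)

    # Two copies of the generator with the same seed break every tie the same way,
    # so the "non-deterministic" bot's moves can be predicted exactly.
    dirt = {"N": 3, "S": 3, "E": 1, "W": 2}
    assert greedy_move(dirt, random.Random(42)) == greedy_move(dirt, random.Random(42))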
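For part d), a minimal sketch of the interval idea, assuming the sensor misjudges the dirt value by at most 20%; the helper names dirt_interval and pick_square are hypothetical, not from the exercise.

    import random

    def dirt_interval(x):
        """Range of true dirt values behind a sensed reading x that may be off by up to 20%."""
        return (x / 1.2, x / 0.8)

    def pick_square(readings, rng=random):
        """readings maps each square to its sensed dirt value.  Prefer the square with
        the highest guaranteed dirt (largest lower bound); every square whose interval
        overlaps it is an equally defensible choice, so that tie is resolved randomly."""
        intervals = {sq: dirt_interval(x) for sq, x in readings.items()}
        lo_best = max(lo for lo, hi in intervals.values())
        candidates = [sq for sq, (lo, hi) in intervals.items() if hi >= lo_best]
        return rng.choice(candidates)

    # Example: readings 10 and 11 have overlapping intervals, so either may be chosen;
    # the square reading 2 never is.
    print(pick_square({"A": 10, "B": 11, "C": 2}))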
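Finally, for part e), a minimal sketch of the dirt estimate. It assumes, purely for illustration, that the number of moves until a cleaned square becomes dirty again is normally distributed; the constants MU and SIGMA and the function names are hypothetical, not given in the exercise.

    from statistics import NormalDist

    # Hypothetical parameters of the Gaussian "time until dirty again" (in moves).
    MU, SIGMA = 10.0, 3.0
    redirty_time = NormalDist(MU, SIGMA)

    def p_dirty_again(n):
        """P(square got dirty within n moves) - P(square got dirty within n-1 moves)."""
        return redirty_time.cdf(n) - redirty_time.cdf(n - 1)

    def estimated_dirt(dirt_seen_so_far, n):
        """Dirt to assume for a square cleaned earlier, at the bot's n-th step:
        (expected dirt value observed so far) * (probability it got dirty again)."""
        expected_dirt = sum(dirt_seen_so_far) / len(dirt_seen_so_far)
        return expected_dirt * p_dirty_again(n)

    print(estimated_dirt([4, 7, 5, 6], n=9))

The greedy comparison in part e) would then treat this estimate like any other sensed neighbour value when choosing the next move.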