Model of Reinforcement Learning in the Mouse Reaching and Grasping Experiment
The Mouse Reaching and Grasping Performance Scale (MoRaG) experiment aims to observe the effect of certain gene mutations on the cognitive abilities of mice. In this experiment, a mouse is placed in a transparent cage with an opening so small that a food pellet placed on the other side cannot be reached by a nose-poke, but only by a hand-reach. The main observation recorded during the MoRaG experiment was the number of nose-pokes performed by each mutant mouse in an attempt to reach the target food pellet. This nose-poking was followed by hand-reach actions, which eventually led the mice to retrieve the food pellet successfully. The sequence and number of nose-poke and hand-reach actions performed by an animal during the MoRaG experiment result from two parameters: (i) speed of learning and (ii) amount of preference for exploratory behaviour. To enable easy quantification and analysis of these two parameters, a computational model was built. The model assumes that the mouse selects one of two actions, hand-reach or nose-poke, and that each action is associated with a weight that determines the probability of its selection and is updated according to the Rescorla-Wagner rule. For each type of mutant mouse used in the MoRaG experiment, the two parameters (describing speed of learning and preference for exploration) were estimated using the maximum likelihood method. The model was able to replicate the behaviour of the mice observed during the MoRaG experiment and to quantify the cognitive abilities of the mutant mice successfully, thereby helping the scientists involved in the MoRaG experiment to assess the effect of the genetic mutation on the mice.
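The structure of such a model can be illustrated with a minimal Python sketch. It is not the implementation used in the study, and several details are assumptions made only for illustration: the action weights are mapped to selection probabilities with a softmax whose inverse temperature `beta` stands in for the preference for exploration (smaller `beta` means more exploration), the learning rate `alpha` stands in for the speed of learning, only a hand-reach is rewarded, the initial weights favour the nose-poke, and the maximum likelihood fit is done by a simple grid search. All function and parameter names are hypothetical.

```python
import math
import random

NOSE_POKE, HAND_REACH = 0, 1  # indices of the two actions


def choice_probs(weights, beta):
    """Softmax over action weights (assumed mapping from weights to probabilities)."""
    m = max(weights)  # subtract the maximum for numerical stability
    exps = [math.exp(beta * (w - m)) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]


def rw_update(weight, reward, alpha):
    """Rescorla-Wagner rule: move the weight toward the received reward."""
    return weight + alpha * (reward - weight)


def reward_for(action):
    """Assumption: only a hand-reach retrieves the pellet."""
    return 1.0 if action == HAND_REACH else 0.0


def simulate(alpha, beta, n_trials, rng, init=(0.5, 0.0)):
    """Generate a sequence of actions; init is an assumed initial bias to nose-poke."""
    weights = list(init)
    actions = []
    for _ in range(n_trials):
        p = choice_probs(weights, beta)
        a = NOSE_POKE if rng.random() < p[NOSE_POKE] else HAND_REACH
        weights[a] = rw_update(weights[a], reward_for(a), alpha)
        actions.append(a)
    return actions


def neg_log_likelihood(alpha, beta, actions, init=(0.5, 0.0)):
    """Negative log-likelihood of an observed action sequence under the model."""
    weights = list(init)
    nll = 0.0
    for a in actions:
        p = choice_probs(weights, beta)
        nll -= math.log(p[a])
        weights[a] = rw_update(weights[a], reward_for(a), alpha)
    return nll


def fit_grid(actions, alphas, betas):
    """Maximum likelihood estimate of (alpha, beta) by exhaustive grid search."""
    best = min(
        (neg_log_likelihood(a, b, actions), a, b) for a in alphas for b in betas
    )
    return best[1], best[2]
```

For example, `simulate(0.2, 3.0, 200, random.Random(0))` produces a synthetic action sequence, and passing that sequence to `fit_grid` together with candidate grids for `alpha` and `beta` returns the pair with the highest likelihood. A grid search is used here only because it is easy to follow; a gradient-based or simplex optimiser would serve the same purpose.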