A Modular Neurocontroller for Creative Mobile Autonomous
Robots Learning by Temporal Difference
One of the most prominent research goals in the field of mobile autonomous robots
is to create robots that are able to adapt to new environments, i.e., the robots
should be able to learn during their lifetime, possibly with no (or only minimal)
human intervention. When employing artificial neural networks (ANNs) to control
the robot, reinforcement learning (RL) techniques are good candidates for
achieving continuous on-line learning. A problem with RL applied to robot
learning is that the state (and action) space of a robot is typically not discrete.
Thus, the robot would have to evaluate an infinite number of possible actions at every
time step in order to select the best one. To overcome this problem, we add a second
network module to the neurocontroller acting as a memory of previous decisions
(state-action pairs) of the robot. The robot's actual decisions, then, are based on
previous decisions retrieved from memory. Additionally, intrinsic noise in the
memory network gives the robot the possibility to evaluate new ideas; hence, it
becomes creative. We analyze the potential of the above approach by measuring
the ability of (simulated) robots to learn simple tasks using temporal difference (TD)
learning.
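The idea of evaluating a small set of remembered and noisy actions instead of an infinite action space, combined with a TD update, might be sketched as follows. This is only a minimal illustration under stated assumptions, not the neurocontroller described above: a linear value estimate stands in for the ANN, actions are single real numbers, and all names (`td0_update`, `propose_actions`, `select_action`) and parameter values are hypothetical.

```python
import random


def value(weights, features):
    """Linear value estimate V(s) = w . phi(s) (assumed stand-in for the ANN)."""
    return sum(w * x for w, x in zip(weights, features))


def td0_update(weights, features, reward, next_features, alpha=0.1, gamma=0.9):
    """One TD(0) step with linear function approximation.

    delta = r + gamma * V(s') - V(s);  w += alpha * delta * phi(s)
    The weights list is updated in place; the TD error is returned.
    """
    delta = reward + gamma * value(weights, next_features) - value(weights, features)
    for i, x in enumerate(features):
        weights[i] += alpha * delta * x
    return delta


def propose_actions(remembered_action, noise_std, n_candidates, rng):
    """Return the remembered action followed by noisy variants of it.

    The Gaussian noise is an assumed stand-in for the intrinsic noise
    of the memory network; it yields "new ideas" near a past decision.
    """
    candidates = [remembered_action]
    for _ in range(n_candidates - 1):
        candidates.append(remembered_action + rng.gauss(0.0, noise_std))
    return candidates


def select_action(candidates, evaluate):
    """Pick the candidate with the highest score from a finite list."""
    return max(candidates, key=evaluate)
```

In such a sketch, only a handful of candidates have to be scored at each time step, so the choice stays tractable even though the underlying action space is continuous.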
Helmut A. Mayer
Last modified: Feb 1 2005