A Modular Neurocontroller for Creative Mobile Autonomous Robots Learning by Temporal Difference

One of the most prominent research goals in the field of mobile autonomous robots
is to create robots that are able to adapt to new environments, i.e., robots that
learn during their lifetime, ideally without (or with a minimum of) human
intervention. When artificial neural networks (ANNs) are employed to control the
robot, reinforcement learning (RL) techniques are good candidates for achieving
continuous on-line learning. A problem with RL applied to robot learning is that
the state (and action) space of a robot is typically not discrete. Thus, the robot
would have to evaluate an infinite number of possible actions at every time step
in order to select the best one. To overcome this problem, we add a second network
module to the neurocontroller that acts as a memory of the robot's previous
decisions (state-action pairs). The robot's actual decisions are then based on
previous decisions retrieved from memory. Additionally, intrinsic noise in the
memory network enables the robot to evaluate new ideas; in this sense, it becomes
creative. We analyze the potential of this approach by measuring the ability of
(simulated) robots to learn simple tasks using temporal difference (TD) learning.
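
The following Python sketch illustrates the combination described above: a memory
of previous (state, action) decisions queried by nearest state, Gaussian noise on
the recalled action, and a tabular TD(0) value estimate. The toy task, all names
(MemoryModule, td_update, discretize), and the rule for which decisions are
remembered are illustrative assumptions, not the implementation used in the paper.

import random

def discretize(s, step=0.1):
    """Crude state discretization so a tabular value function suffices."""
    return round(s / step) * step

class MemoryModule:
    """Stores previous (state, action) decisions of the robot.
    Retrieval is nearest-neighbor on the state; Gaussian noise on the
    recalled action stands in for the intrinsic noise that lets the
    robot evaluate new ideas."""

    def __init__(self, noise_std=0.2):
        self.pairs = []                 # remembered (state, action) tuples
        self.noise_std = noise_std

    def remember(self, state, action):
        self.pairs.append((state, action))

    def propose(self, state):
        if not self.pairs:              # no experience yet: act randomly
            return random.uniform(-1.0, 1.0)
        _, action = min(self.pairs, key=lambda p: abs(p[0] - state))
        return action + random.gauss(0.0, self.noise_std)

def td_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
    """Tabular TD(0): V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))."""
    k, k_next = discretize(s), discretize(s_next)
    delta = r + gamma * V.get(k_next, 0.0) - V.get(k, 0.0)   # TD error
    V[k] = V.get(k, 0.0) + alpha * delta
    return delta

# Toy task: drive a 1-D state toward 0; reward is negative distance.
V, memory = {}, MemoryModule()
state = random.uniform(-1.0, 1.0)
for _ in range(1000):
    action = memory.propose(state)
    next_state = max(-1.0, min(1.0, state + 0.1 * action))
    reward = -abs(next_state)
    if td_update(V, state, reward, next_state) > 0:
        memory.remember(state, action)  # keep decisions that raised the value estimate
    state = next_state

Note that in the actual work the memory is itself a network module of the
neurocontroller and the noise is intrinsic to that network; the lookup table here
merely stands in for a learned value function.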
		
		
Helmut A. Mayer

Last modified: Feb 1 2005