Book sections

Q-learning for Robots

Abstract : Robot learning is a challenging, and somewhat unique, research domain. If a robot behavior is defined as a mapping between situations encountered in the real world and the actions to be performed, then the supervised learning of a robot behavior requires a set of representative examples (situation, desired action). To gather such a learning base, the human operator must have a deep understanding of the robot-world interaction (i.e., a model). However, there are many application domains where such models cannot be obtained, either because detailed knowledge of the robot’s world is unavailable (e.g., space or underwater exploration, nuclear or toxic waste management) or because obtaining it would be too costly. In this context, the automatic synthesis of a representative learning base is an important issue. It can be addressed using reinforcement learning techniques, in particular Q-learning, which does not require a model of the robot-world interaction. Whereas supervised learning examples are pairs, Q-learning examples are triplets (situation, action, Q value), where the Q value is the utility of executing the action in the situation. The supervised learning base is obtained by recruiting the triplets with the highest utility.
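The idea in the abstract can be illustrated with a minimal sketch, which is not the chapter's own implementation. It assumes a tabular Q function, the standard one-step Q-learning update, and a hypothetical five-cell corridor world (`corridor_step`) standing in for the robot-world interaction. The function names, parameters, and the rule "keep the highest-Q action per situation" are illustrative assumptions about how triplets might be recruited.

```python
import random
from collections import defaultdict


def q_learning(step, situations, actions, episodes=2000, horizon=20,
               alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Tabular Q-learning; `step` is only sampled, never inspected as a model."""
    rng = random.Random(seed)
    q = defaultdict(float)  # maps (situation, action) to a Q value
    for _ in range(episodes):
        s = rng.choice(situations)
        for _ in range(horizon):
            # Epsilon-greedy exploration over the action set.
            if rng.random() < epsilon:
                a = rng.choice(actions)
            else:
                a = max(actions, key=lambda act: q[(s, act)])
            s_next, reward = step(s, a)
            # One-step Q-learning update toward reward + discounted best next value.
            best_next = max(q[(s_next, act)] for act in actions)
            q[(s, a)] += alpha * (reward + gamma * best_next - q[(s, a)])
            s = s_next
    return q


def recruit_learning_base(q, situations, actions):
    """Assumed recruitment rule: keep, per situation, the action with the highest Q."""
    return [(s, max(actions, key=lambda act: q[(s, act)])) for s in situations]


def corridor_step(s, a):
    """Hypothetical toy world: cells 0..4, reward 1 whenever cell 4 is reached."""
    s_next = min(4, max(0, s + a))
    return s_next, (1.0 if s_next == 4 else 0.0)


situations = list(range(5))
actions = [-1, +1]
q = q_learning(corridor_step, situations, actions)
learning_base = recruit_learning_base(q, situations, actions)
print(learning_base)  # pairs (situation, desired action) for supervised learning
```

The resulting `learning_base` has the (situation, desired action) form that a supervised learner expects, and it was obtained only from sampled interactions and rewards, without a hand-built model of the world.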

Cited literature [13 references]
Contributor : Claude Touzet
Submitted on : Monday, June 27, 2016 - 5:14:11 PM
Last modification on : Friday, October 22, 2021 - 3:33:01 AM




  • HAL Id : hal-01338045, version 1



Claude Touzet. Q-learning for Robots. M. Arbib. The Handbook of Brain Theory and Neural Networks (Second Edition), MIT Press, pp. 934-937, 2003. ⟨hal-01338045⟩


