
Distributed Lazy Q-learning for Cooperative Mobile Robots

Abstract: Compared to single-robot learning, cooperative learning adds the challenges of a much larger search space (the combination of the individual search spaces), awareness of the other team members, and the synthesis of the individual behaviors with respect to the task given to the group. Over the years, reinforcement learning has emerged as the main learning approach in autonomous robotics, and lazy learning has become the leading bias, reducing the time required by an experiment to the time needed to test the performance of the learned behavior. These two approaches have been combined into what is now called lazy Q-learning, a very efficient single-robot learning paradigm. We propose an extension of this paradigm to teams of robots: the "pessimistic" algorithm, which computes for each team member a lower bound on the utility of executing an action in a given situation. We use the cooperative multi-robot observation of multiple moving targets (CMOMMT) application as an illustrative example, and study the efficiency of the pessimistic algorithm in its task of inducing the learning of cooperation.
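The pessimistic lower-bound idea described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the memory-based (lazy) utility estimate, the nearest-neighbor scheme, and all names and state encodings here are assumptions. Each robot keeps a memory of experienced situations with the joint action taken and the observed utility; at decision time it estimates, for each of its own actions, the worst case over the teammates' possible actions, and picks the action whose lower bound is highest.

```python
import math

# Lazy (memory-based) Q-learning sketch: store experienced samples as
# (situation, own_action, others_action, q) and defer all generalization
# to decision time via nearest-neighbor lookup.
memory = []

def estimate_q(situation, own_action, others_action, k=3):
    """Average q of the k stored samples closest to `situation` that
    match the given joint action (hypothetical lookup scheme)."""
    matches = [(math.dist(situation, s), q)
               for s, a, o, q in memory
               if a == own_action and o == others_action]
    if not matches:
        return 0.0  # no experience yet: neutral default (an assumption)
    matches.sort(key=lambda t: t[0])
    nearest = matches[:k]
    return sum(q for _, q in nearest) / len(nearest)

def pessimistic_action(situation, own_actions, others_actions):
    """Choose the own action maximizing a lower bound on utility:
    the worst case over the teammates' possible actions."""
    def lower_bound(a):
        return min(estimate_q(situation, a, o) for o in others_actions)
    return max(own_actions, key=lower_bound)
```

Note the contrast with greedy action selection: an action that is excellent under one teammate behavior but disastrous under another gets a low lower bound, so the pessimistic robot prefers actions that are safe regardless of what the rest of the team does.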

Cited literature: 19 references

https://hal-amu.archives-ouvertes.fr/hal-01337605
Contributor: Claude Touzet
Submitted on: Monday, June 27, 2016 - 4:03:37 PM
Last modification on: Thursday, January 18, 2018 - 1:42:37 AM
Long-term archiving on: Wednesday, September 28, 2016 - 11:16:44 AM

File

Distributed_lazy_Q.pdf
Publisher files allowed on an open archive


Citation

Claude Touzet. Distributed Lazy Q-learning for Cooperative Mobile Robots. International Journal of Advanced Robotic Systems, InTech, 2004, 1, pp.5-13. ⟨10.5772/5614⟩. ⟨hal-01337605⟩


Metrics

Record views: 138
File downloads: 240