Learning Multirobot Hose Transportation and Deployment by Distributed Round-Robin Q-Learning

Fernández Gauna, Borja; Etxeberria Agiriano, Ismael; Graña Romay, Manuel María

dc.contributor.author	Fernández Gauna, Borja
dc.contributor.author	Etxeberria Agiriano, Ismael
dc.contributor.author	Graña Romay, Manuel María
dc.date.accessioned	2016-04-11T13:18:11Z
dc.date.available	2016-04-11T13:18:11Z
dc.date.issued	2015-07-09
dc.identifier.citation	PLOS ONE 10(7) : (2015) // Article ID e0127129	es
dc.identifier.issn	1932-6203
dc.identifier.uri	http://hdl.handle.net/10810/17878
dc.description.abstract	Multi-Agent Reinforcement Learning (MARL) algorithms face two main difficulties: the curse of dimensionality, and environment non-stationarity due to the independent learning processes carried out by the agents concurrently. In this paper we formalize and prove the convergence of a Distributed Round Robin Q-learning (D-RR-QL) algorithm for cooperative systems. The computational complexity of this algorithm increases linearly with the number of agents. Moreover, it eliminates environment non-stationarity by carrying out a round-robin scheduling of the action selection and execution. This learning scheme allows the implementation of Modular State-Action Vetoes (MSAV) in cooperative multi-agent systems, which speeds up learning convergence in over-constrained systems by vetoing state-action pairs that lead to undesired termination states (UTS) in the relevant state-action subspace. Each agent's local state-action value function learning is an independent process, including the MSAV policies. Coordination of locally optimal policies to obtain the global optimal joint policy is achieved by a greedy selection procedure using message passing. We show that D-RR-QL improves over state-of-the-art approaches, such as Distributed Q-Learning, Team Q-Learning and Coordinated Reinforcement Learning, in a paradigmatic Linked Multi-Component Robotic System (L-MCRS) control problem: the hose transportation task. L-MCRS are over-constrained systems with many UTS induced by the interaction of the passive linking element and the active mobile robots.	es
dc.description.sponsorship	This research has been partially funded by the EU through the SandS project, grant agreement no. 317947. This research has been partially funded by grant TIN2011-23823 of the Ministerio de Ciencia e Innovacion of the Spanish Government (MINECO), with FEDER funds. The GIC has been supported by grant IT874-13 as university research group category A. MG was supported by the EC under FP7, Coordination and Support Action, Grant Agreement Number 316097, ENGINE European Research Centre of Network Intelligence for Innovation Enhancement. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.	es
dc.language.iso	eng	es
dc.publisher	Public Library of Science	es
dc.relation	info:eu-repo/grantAgreement/EC/FP7/317947	es
dc.relation	info:eu-repo/grantAgreement/MINECO/TIN2011-23823
dc.rights	info:eu-repo/semantics/openAccess	es
dc.subject	system control	es
dc.subject	reinforcement	es
dc.subject	constraints	es
dc.subject	MDPs	es
dc.title	Learning Multirobot Hose Transportation and Deployment by Distributed Round-Robin Q-Learning	es
dc.type	info:eu-repo/semantics/article	es
dc.rights.holder	© 2015 Fernandez-Gauna et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited	es
dc.relation.publisherversion	http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0127129#abstract0	es
dc.identifier.doi	10.1371/journal.pone.0127129
dc.departamentoes	Ciencia de la computación e inteligencia artificial	es_ES
dc.departamentoes	Lenguajes y sistemas informáticos	es_ES
dc.departamentoeu	Konputazio zientziak eta adimen artifiziala	es_ES
dc.departamentoeu	Hizkuntza eta sistema informatikoak	es_ES
dc.subject.categoria	AGRICULTURAL AND BIOLOGICAL SCIENCES
dc.subject.categoria	MEDICINE
dc.subject.categoria	BIOCHEMISTRY AND MOLECULAR BIOLOGY

Files in this item

Name: journal.pone.0127129.PDF
Size: 633.9Kb
Format: PDF

This item appears in the following collection(s)

Articles
OpenAire
European Commission