Learning positioning policies for mobile manipulation operations with deep reinforcement learning
dc.contributor.author | Iriondo Azpiri, Ander | |
dc.contributor.author | Lazkano Ortega, Elena | |
dc.contributor.author | Ansuategi Cobo, Ander | |
dc.contributor.author | Rivera Pinto, Andoni | |
dc.contributor.author | Lluvia Hermosilla, Iker | |
dc.contributor.author | Tubío Otero, Carlos | |
dc.date.accessioned | 2024-04-29T18:22:24Z | |
dc.date.available | 2024-04-29T18:22:24Z | |
dc.date.issued | 2023 | |
dc.identifier.citation | International Journal of Machine Learning and Cybernetics 14: 3003-3023 (2023) | es_ES |
dc.identifier.issn | 1868-808X | |
dc.identifier.issn | 1868-8071 | |
dc.identifier.uri | http://hdl.handle.net/10810/66934 | |
dc.description.abstract | This work focuses on the operation of picking an object on a table with a mobile manipulator. We use deep reinforcement learning (DRL) to learn a positioning policy for the robot's base by considering the reachability constraints of the arm. This work extends our first proof-of-concept with the ultimate goal of validating the method on a real robot. The Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm is used to model the base controller, and it is optimised using feedback from the MoveIt!-based arm planner. The idea is to encourage the base controller to position itself in areas from which the arm can reach the object. Following a simulation-to-reality approach, we first create a realistic simulation of the robotic environment in Unity and integrate it into the Robot Operating System (ROS). The drivers for both the base and the arm are also implemented. The DRL-based agent is trained in simulation, and both the robot and target poses are randomised to make the learnt base controller robust to uncertainties. We propose a task-specific setup for TD3, which includes the state/action spaces, reward function and neural architectures. We compare the proposed method with the baseline work and show that the combination of TD3 and the proposed setup leads to an 11% higher success rate than the baseline, with an overall success rate of 97%. Finally, the learnt agent is deployed and validated on the real robotic system, where we obtain a promising success rate of 75%. | es_ES |
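To make the setup the abstract describes more concrete, the following is a minimal, hypothetical Python sketch: a TD3 agent (here via the stable-baselines3 library, which the paper does not necessarily use) learns to position a mobile base so that a placeholder reachability check, standing in for the MoveIt!-based arm planner, succeeds. The environment, class and method names (BasePositioningEnv, _arm_can_reach), the reward shaping, and all bounds are illustrative assumptions, not the authors' implementation.

import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import TD3  # assumes stable-baselines3 >= 2.0 (gymnasium support)

class BasePositioningEnv(gym.Env):
    # Hypothetical stand-in for the Unity/ROS simulation: the agent moves the
    # mobile base (dx, dy, dyaw) and is rewarded when a placeholder
    # reachability check (standing in for the MoveIt! arm planner) succeeds.
    def __init__(self):
        # Observation: target pose relative to the base (x, y, yaw).
        self.observation_space = spaces.Box(-3.0, 3.0, shape=(3,), dtype=np.float32)
        # Action: continuous per-step base displacement and rotation.
        self.action_space = spaces.Box(-0.2, 0.2, shape=(3,), dtype=np.float32)
        self._rel_target = np.zeros(3, dtype=np.float32)
        self._steps = 0

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        # Randomise the relative target pose, mirroring the abstract's
        # randomisation of robot and target poses.
        self._rel_target = self.np_random.uniform(-2.0, 2.0, size=3).astype(np.float32)
        self._steps = 0
        return self._rel_target.copy(), {}

    def step(self, action):
        self._steps += 1
        self._rel_target -= np.asarray(action, dtype=np.float32)
        reachable = self._arm_can_reach(self._rel_target)
        # Illustrative shaping: success bonus, otherwise a distance penalty.
        reward = 10.0 if reachable else -float(np.linalg.norm(self._rel_target[:2]))
        return self._rel_target.copy(), reward, reachable, self._steps >= 50, {}

    def _arm_can_reach(self, rel_target):
        # Placeholder for a planner/reachability query against the real arm.
        return bool(np.linalg.norm(rel_target[:2]) < 0.6)

env = BasePositioningEnv()
model = TD3("MlpPolicy", env, verbose=0)
model.learn(total_timesteps=10_000)  # short demonstration run

In the paper's actual pipeline, the reachability check would be a query to the MoveIt!-based arm planner running against the Unity/ROS simulation, and the success signal would come from a valid pick plan rather than a distance threshold.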
dc.description.sponsorship | This publication has been funded by the Basque Government - Department of Economic Development, Sustainability and Environment - Aid program for collaborative research in strategic areas - ELKARTEK 2021 Program (File KK-2021/00033 TREBEZIA), and the project “5R- Red Cervera de Tecnologías robóticas en fabricación inteligente”, contract number CER-20211007, under “Centros Tecnológicos de Excelencia Cervera” programme funded by “The Centre for the Development of Industrial Technology (CDTI)”. | es_ES |
dc.language.iso | eng | es_ES |
dc.publisher | Springer Nature | es_ES |
dc.rights | info:eu-repo/semantics/openAccess | es_ES |
dc.rights.uri | http://creativecommons.org/licenses/by/3.0/es/ | * |
dc.subject | mobile manipulation | es_ES |
dc.subject | pick and place | es_ES |
dc.subject | deep reinforcement learning | es_ES |
dc.subject | sim-to-real transfer | es_ES |
dc.title | Learning positioning policies for mobile manipulation operations with deep reinforcement learning | es_ES |
dc.type | info:eu-repo/semantics/article | es_ES |
dc.rights.holder | © The Author(s) 2023. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. | es_ES |
dc.rights.holder | Attribution 3.0 Spain | * |
dc.relation.publisherversion | https://link.springer.com/article/10.1007/s13042-023-01815-8 | es_ES |
dc.identifier.doi | 10.1007/s13042-023-01815-8 | |
dc.departamentoes | Ciencia de la computación e inteligencia artificial | es_ES |
dc.departamentoeu | Konputazio zientziak eta adimen artifiziala | es_ES |