
dc.contributor.author: Cabezas Olivenza, Mireya
dc.contributor.author: Zulueta Guerrero, Ekaitz
dc.contributor.author: Sánchez Chica, Ander
dc.contributor.author: Fernández Gámiz, Unai
dc.contributor.author: Teso Fernández de Betoño, Adrián
dc.date.accessioned: 2023-01-12T14:57:04Z
dc.date.available: 2023-01-12T14:57:04Z
dc.date.issued: 2022-12-27
dc.identifier.citation: Mathematics 11(1) : (2023) // Article ID 132
dc.identifier.issn: 2227-7390
dc.identifier.uri: http://hdl.handle.net/10810/59262
dc.description.abstract: The Deep Deterministic Policy Gradient (DDPG) algorithm is a reinforcement learning algorithm that combines Q-learning with a policy. Nevertheless, this algorithm generates failures that are not well understood. Rather than looking for those errors, this study presents a way to evaluate the suitability of the results obtained. Applied to autonomous vehicle navigation, the DDPG algorithm yields an agent capable of generating trajectories. This agent is evaluated in terms of stability through the Lyapunov function, verifying whether the proposed navigation objectives are achieved. Because it is unknown whether the actor and critic neural networks are correctly trained, the reward function of the DDPG is used for this evaluation. Two agents are obtained and compared in terms of stability, demonstrating that the Lyapunov function can be used as an evaluation method for agents obtained by the DDPG algorithm. By verifying stability over a fixed future horizon, it is possible to determine whether the obtained agent is valid and can be used as a vehicle controller, so a task-satisfaction assessment can be performed. Furthermore, the proposed analysis indicates which parts of the navigation area are insufficiently covered by training. (A minimal illustrative sketch of this horizon check follows the record below.)
dc.description.sponsorship: This study was sponsored by the Government of the Basque Country ELKARTEK21/10 KK-2021/00014 research program "Estudio de nuevas técnicas de inteligencia artificial basadas en Deep Learning dirigidas a la optimización de procesos industriales" (Study of new artificial intelligence techniques based on Deep Learning aimed at the optimization of industrial processes).
dc.language.iso: eng
dc.publisher: MDPI
dc.rights: info:eu-repo/semantics/openAccess
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.subject: navigation
dc.subject: neural network
dc.subject: autonomous vehicle
dc.subject: reinforcement learning
dc.subject: DDPG
dc.subject: Lyapunov
dc.subject: stability
dc.subject: Q-learning
dc.title: Stability Analysis for Autonomous Vehicle Navigation Trained over Deep Deterministic Policy Gradient
dc.type: info:eu-repo/semantics/article
dc.date.updated: 2023-01-06T13:52:45Z
dc.rights.holder: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
dc.relation.publisherversion: https://www.mdpi.com/2227-7390/11/1/132
dc.identifier.doi: 10.3390/math11010132
dc.departamentoes: Ingeniería de sistemas y automática
dc.departamentoes: Ingeniería Energética
dc.departamentoeu: Sistemen ingeniaritza eta automatika
dc.departamentoeu: Energia Ingenieritza
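
Illustrative sketch of the stability check described in the abstract. The evaluation it outlines, rolling a trained DDPG actor forward over a fixed future horizon and requiring a Lyapunov candidate to decrease along the resulting trajectory, can be sketched roughly as below. This is a minimal sketch, not the authors' implementation: the quadratic candidate lyapunov(), the actor and env interfaces, the state layout, and the horizon length are all assumptions introduced here for illustration.

    import numpy as np

    # Hypothetical quadratic Lyapunov candidate: squared distance from the
    # vehicle position (assumed to be state[:2]) to the goal. The paper's
    # actual candidate function and state layout are not specified here.
    def lyapunov(state, goal):
        return float(np.sum((np.asarray(state)[:2] - np.asarray(goal)) ** 2))

    def is_stable_over_horizon(actor, env, goal, horizon=50):
        """Roll the trained DDPG actor forward for a fixed horizon and
        require the Lyapunov candidate to be non-increasing at every step.
        `actor` is assumed to be the deterministic policy a = mu(s);
        `env` is assumed to follow the classic Gym step() interface."""
        state = env.reset()
        v_prev = lyapunov(state, goal)
        for _ in range(horizon):
            action = actor(state)
            state, _reward, done, _info = env.step(action)
            v = lyapunov(state, goal)
            if v > v_prev:  # Lyapunov decrease condition violated
                return False
            v_prev = v
            if done:
                break
        return True

In this reading, an agent that passes the check over the chosen horizon would count as a valid vehicle controller in the sense described in the abstract, while the states at which the check fails would point to regions of the navigation area that are insufficiently covered by training.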

