Cable SCARA Robot Controlled by a Neural Network Using Reinforcement Learning

Bibliographic details
Main author: Okabe, Eduardo
Publication date: 2023
Other authors: Paiva, Victor, Silva-Teixeira, Luis H., Izuka, Jaime
Language: eng
Source title: Biblioteca Digital de Teses e Dissertações do INSPER
Full text: https://repositorio.insper.edu.br/handle/11224/6676
Abstract: In this work, three reinforcement learning algorithms (Proximal Policy Optimization, Soft Actor-Critic, and Twin Delayed Deep Deterministic Policy Gradient) are employed to control a two-link selective compliance articulated robot arm (SCARA). This robot has three cables attached to its end-effector, which create a triangular workspace. Positioning the end-effector inside the workspace is a relatively simple kinematic problem, but moving outside this region, although possible, requires a nonlinear dynamic model and a state-of-the-art controller. To solve this problem in a simple manner, reinforcement learning algorithms are used to find feasible trajectories to three targets outside the workspace. Additionally, the SCARA mechanism offers two possible configurations for each end-effector position. The algorithms are compared in terms of displacement error, velocity, and standard deviation across ten trajectories produced by the trained network. The results indicate that Proximal Policy Optimization is the most consistent in the analyzed situations. Still, Soft Actor-Critic produced better solutions, and Twin Delayed Deep Deterministic Policy Gradient yielded more unusual and interesting trajectories.
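The two configurations mentioned in the abstract are the classic elbow-down and elbow-up solutions of a two-link planar arm: for any reachable end-effector position there are two mirrored elbow angles. The sketch below illustrates this with standard two-link inverse/forward kinematics; it is a minimal illustration, not the paper's model (function names and link lengths are assumptions, and the cable dynamics are not represented).

```python
import math

def scara_ik(x, y, l1, l2):
    """Inverse kinematics of a two-link planar (SCARA-like) arm.

    Returns the two joint-angle solutions (elbow-down and elbow-up)
    that place the end-effector at (x, y); raises ValueError if the
    target lies outside the reachable annulus.
    """
    c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if not -1.0 <= c2 <= 1.0:
        raise ValueError("target out of reach")
    solutions = []
    for sign in (+1.0, -1.0):  # the two elbow configurations
        theta2 = sign * math.acos(c2)
        theta1 = math.atan2(y, x) - math.atan2(
            l2 * math.sin(theta2), l1 + l2 * math.cos(theta2))
        solutions.append((theta1, theta2))
    return solutions

def scara_fk(theta1, theta2, l1, l2):
    """Forward kinematics: joint angles -> end-effector position."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y
```

Both solutions map back to the same target under the forward kinematics; their elbow angles differ only in sign, which is what forces a controller (learned or classical) to commit to one configuration along a trajectory.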
id INSP_354cbbe4dc4e3bfb6b49f458f53b1ca7
oai_identifier_str oai:repositorio.insper.edu.br:11224/6676
network_acronym_str INSP
network_name_str Biblioteca Digital de Teses e Dissertações do INSPER
repository_id_str
dc.title.none.fl_str_mv Cable SCARA Robot Controlled by a Neural Network Using Reinforcement Learning
title Cable SCARA Robot Controlled by a Neural Network Using Reinforcement Learning
spellingShingle Cable SCARA Robot Controlled by a Neural Network Using Reinforcement Learning
Okabe, Eduardo
Algorithms
Cables
Reinforcement learning
Robots
Artificial neural networks
title_short Cable SCARA Robot Controlled by a Neural Network Using Reinforcement Learning
title_full Cable SCARA Robot Controlled by a Neural Network Using Reinforcement Learning
title_fullStr Cable SCARA Robot Controlled by a Neural Network Using Reinforcement Learning
title_full_unstemmed Cable SCARA Robot Controlled by a Neural Network Using Reinforcement Learning
title_sort Cable SCARA Robot Controlled by a Neural Network Using Reinforcement Learning
author Okabe, Eduardo
author_facet Okabe, Eduardo
Paiva, Victor
Silva-Teixeira, Luis H.
Izuka, Jaime
author_role author
author2 Paiva, Victor
Silva-Teixeira, Luis H.
Izuka, Jaime
author2_role author
author
author
dc.contributor.author.fl_str_mv Okabe, Eduardo
Paiva, Victor
Silva-Teixeira, Luis H.
Izuka, Jaime
dc.subject.por.fl_str_mv Algorithms
Cables
Reinforcement learning
Robots
Artificial neural networks
topic Algorithms
Cables
Reinforcement learning
Robots
Artificial neural networks
description In this work, three reinforcement learning algorithms (Proximal Policy Optimization, Soft Actor-Critic, and Twin Delayed Deep Deterministic Policy Gradient) are employed to control a two-link selective compliance articulated robot arm (SCARA). This robot has three cables attached to its end-effector, which create a triangular workspace. Positioning the end-effector inside the workspace is a relatively simple kinematic problem, but moving outside this region, although possible, requires a nonlinear dynamic model and a state-of-the-art controller. To solve this problem in a simple manner, reinforcement learning algorithms are used to find feasible trajectories to three targets outside the workspace. Additionally, the SCARA mechanism offers two possible configurations for each end-effector position. The algorithms are compared in terms of displacement error, velocity, and standard deviation across ten trajectories produced by the trained network. The results indicate that Proximal Policy Optimization is the most consistent in the analyzed situations. Still, Soft Actor-Critic produced better solutions, and Twin Delayed Deep Deterministic Policy Gradient yielded more unusual and interesting trajectories.
publishDate 2023
dc.date.none.fl_str_mv 2023
2024-05-28T16:50:11Z
2024-05-28T16:50:11Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
status_str publishedVersion
dc.identifier.uri.fl_str_mv 1555-1423
1555-1415
https://repositorio.insper.edu.br/handle/11224/6676
10.1115/1.4063222
identifier_str_mv 1555-1423
1555-1415
10.1115/1.4063222
url https://repositorio.insper.edu.br/handle/11224/6676
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv Journal of Computational and Nonlinear Dynamics
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv Digital
7 p.
image/png
dc.publisher.none.fl_str_mv University of Michigan
publisher.none.fl_str_mv University of Michigan
dc.source.none.fl_str_mv reponame:Biblioteca Digital de Teses e Dissertações do INSPER
instname:Instituição de Ensino Superior e de Pesquisa (INSPER)
instacron:INSPER
instname_str Instituição de Ensino Superior e de Pesquisa (INSPER)
instacron_str INSPER
institution INSPER
reponame_str Biblioteca Digital de Teses e Dissertações do INSPER
collection Biblioteca Digital de Teses e Dissertações do INSPER
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações do INSPER - Instituição de Ensino Superior e de Pesquisa (INSPER)
repository.mail.fl_str_mv biblioteca@insper.edu.br ||
_version_ 1814986265618022400