Cable SCARA Robot Controlled by a Neural Network Using Reinforcement Learning
Principal author: | Okabe, Eduardo |
---|---|
Publication date: | 2023 |
Other authors: | Paiva, Victor; Silva-Teixeira, Luis H.; Izuka, Jaime |
Language: | eng |
Source title: | Biblioteca Digital de Teses e Dissertações do INSPER |
Full text: | https://repositorio.insper.edu.br/handle/11224/6676 |
Abstract: | In this work, three reinforcement learning algorithms (Proximal Policy Optimization, Soft Actor-Critic, and Twin Delayed Deep Deterministic Policy Gradient) are employed to control a two-link selective compliance articulated robot arm (SCARA). The robot has three cables attached to its end-effector, which create a triangular workspace. Positioning the end-effector inside the workspace is a relatively simple kinematic problem, but moving outside this region, although possible, requires a nonlinear dynamic model and a state-of-the-art controller. To solve this problem in a simple manner, the reinforcement learning algorithms are used to find feasible trajectories to three targets outside the workspace. Additionally, the SCARA mechanism offers two possible configurations for each end-effector position. The algorithms are compared in terms of displacement error, velocity, and standard deviation across ten trajectories produced by each trained network. The results indicate that Proximal Policy Optimization is the most consistent in the analyzed situations. Still, Soft Actor-Critic presented better solutions, and Twin Delayed Deep Deterministic Policy Gradient provided interesting and more unusual trajectories. |
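The abstract notes that the SCARA mechanism admits two joint configurations (commonly called elbow-down and elbow-up) for each reachable end-effector position. A minimal sketch of this property using standard planar two-link kinematics (the link lengths `l1`, `l2` below are illustrative values, not taken from the paper):

```python
import math

def forward(l1, l2, q1, q2):
    """End-effector position of a planar two-link arm with joint angles q1, q2."""
    x = l1 * math.cos(q1) + l2 * math.cos(q1 + q2)
    y = l1 * math.sin(q1) + l2 * math.sin(q1 + q2)
    return x, y

def inverse(l1, l2, x, y):
    """Return both joint solutions (elbow-down, elbow-up) for a target (x, y)."""
    c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    if abs(c2) > 1:
        return []  # target outside the reachable annulus
    s2 = math.sqrt(1 - c2 * c2)
    solutions = []
    for sign in (+1, -1):  # the two SCARA configurations
        q2 = math.atan2(sign * s2, c2)
        q1 = math.atan2(y, x) - math.atan2(l2 * math.sin(q2),
                                           l1 + l2 * math.cos(q2))
        solutions.append((q1, q2))
    return solutions
```

As a round-trip check, both returned configurations map back to the same end-effector point through the forward kinematics, which is the ambiguity the trained policies must implicitly resolve when choosing a trajectory.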
Authors: | Okabe, Eduardo; Paiva, Victor; Silva-Teixeira, Luis H.; Izuka, Jaime |
---|---|
Subjects: | Algorithms; Cables; Reinforcement learning; Robots; Artificial neural networks |
Journal: | Journal of Computational and Nonlinear Dynamics |
Publisher: | University of Michigan |
DOI: | 10.1115/1.4063222 |
ISSN: | 1555-1423; 1555-1415 |
Format: | Digital, 7 p., image/png |
Version: | publishedVersion |
Access: | openAccess |
Repository: | Biblioteca Digital de Teses e Dissertações do INSPER, Instituição de Ensino Superior e de Pesquisa (INSPER) |
Contact: | biblioteca@insper.edu.br |