Autonomous trigger-task-sequencing system for Human-Robot Collaborative assembly: A Deep Learning approach on Visual Task Recognition
Principal author: | Garcia, Pedro Miguel Rodrigues Pinto |
---|---|
Publication date: | 2021 |
Document type: | Master's thesis |
Language: | eng |
Source: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
Full text: | http://hdl.handle.net/10362/138621 |
Abstract: | Human-robot collaborative (HRC) assembly environments are becoming increasingly important as the manufacturing paradigm shifts from mass production towards mass customization, and the introduction of an HRC system could significantly improve the flexibility and intelligence of automation. To accomplish tasks efficiently in HRC assembly environments, a robot needs to understand its surroundings by detecting, recognizing, and locating humans and targeted objects, and also to have a higher level of understanding beyond what is currently seen, i.e., to fully understand the sequence of operations and tasks related to each step of the assembly process. Within the scope of the Internet of Things (IoT) and Artificial Intelligence (AI), the most widely used framework for enabling robot vision (or vision in other types of cyber-physical systems) is deep learning, specifically deep learning algorithms based on Convolutional Neural Networks (CNNs). Hence, in this study a system capable of managing the HRC assembly process of a mechanical component was developed, in which computer vision was enabled by CNN-based methods and assembly task recognition was performed solely from visual RGB data of the components in the working space. A model for component detection was first developed by comparing two main CNN-based framework branches: the Faster Region-based Convolutional Neural Network (Faster R-CNN) and You Only Look Once (YOLO). Then, a system that correlated the current state of the working space (i.e., whether certain components were already correctly assembled) with the progression of the assembly sequence was developed with the best-performing 11-class object detector, YOLOv3. This framework was the only one capable of detecting the smallest object class among the benchmarked frameworks, and it achieved a mean average precision (mAP) of 72.26% over the test dataset. Using YOLOv3 as the computer vision-enabling framework, a successful and efficient HRC industrial demonstrator was created. |
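The trigger-task-sequencing idea described in the abstract — correlating which components the detector reports as assembled with the progression of a predefined assembly sequence — can be sketched as a simple state check. The component names and task list below are hypothetical placeholders for illustration, not the thesis's actual assembly steps:

```python
# Hypothetical sketch: map the set of component classes detected as
# "assembled" in the workspace to the next assembly task to trigger.

# Ordered assembly sequence: each step pairs a task name with the set of
# components that must already be assembled for that step to count as done.
ASSEMBLY_SEQUENCE = [
    ("place_base", {"base"}),
    ("mount_bearing", {"base", "bearing"}),
    ("insert_shaft", {"base", "bearing", "shaft"}),
    ("fasten_cover", {"base", "bearing", "shaft", "cover"}),
]

def next_task(detected_assembled):
    """Return the first task whose required components are not all yet
    detected in the workspace, or None if the assembly is complete."""
    for task, required in ASSEMBLY_SEQUENCE:
        if not required.issubset(detected_assembled):
            return task
    return None

# Example: the detector reports only the base and bearing as assembled,
# so the next task to trigger is inserting the shaft.
print(next_task({"base", "bearing"}))  # -> insert_shaft
```

In the actual system the detected set would come from per-frame YOLOv3 detections; this sketch only shows the sequencing logic that sits on top of them.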
Record id: | RCAP_d9600c8c8bf1b8947c0b6a93ad8239e4 |
---|---|
OAI identifier: | oai:run.unl.pt:10362/138621 |
Network: | RCAP |
Repository id: | 7160 |
Author: | Garcia, Pedro Miguel Rodrigues Pinto |
Advisor: | Mendes, Nuno |
Keywords: | Visual assembly task recognition; Human-Robot Collaborative assembly; Online Object Detection; YOLO; Faster R-CNN; Domínio/Área Científica::Engenharia e Tecnologia::Engenharia Mecânica |
Dates: | published 2021-12; deposited 2022-05-25 |
Type: | info:eu-repo/semantics/masterThesis (publishedVersion) |
Format: | application/pdf |
Rights: | info:eu-repo/semantics/openAccess |
Institution: | Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação (RCAAP) |