Ensembles de OCRs para aplicações médicas

Bibliographic details
Main author: João Adriano Portela de Matos Silva
Publication date: 2021
Document type: Dissertation
Language: por
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
Full text: https://hdl.handle.net/10216/137327
Abstract: As research relies more heavily on new technologies, there is a large advantage in extracting data and knowledge stored in traditional media, such as books and written records, into databases. This transition is necessary because it facilitates every process that involves handling and processing data at scale. One such case, and the focus of this dissertation, is the "Child and Youth Health Bulletins". These documents record information on users from birth to 20 years of age. Each document stores the information in a table, and the following pages present the same information as graphs. Much of the information in these bulletins is of interest to the scientific community and worth transferring to digital media so that it can be used in pediatric studies. The dissertation aims to automate the collection of the data contained in the bulletins by means of an Optical Character Recognition (OCR) system combined with machine learning, data mining, and ensemble methods, seeking the best possible predictive performance from the algorithms used.
id RCAP_da4d4ea6f91619d3ab1c77a43b897892
oai_identifier_str oai:repositorio-aberto.up.pt:10216/137327
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
repository_id_str 7160
dc.title.none.fl_str_mv Ensembles de OCRs para aplicações médicas
author João Adriano Portela de Matos Silva
author_facet João Adriano Portela de Matos Silva
author_role author
dc.contributor.author.fl_str_mv João Adriano Portela de Matos Silva
dc.subject.por.fl_str_mv Engenharia electrotécnica, electrónica e informática
Electrical engineering, Electronic engineering, Information engineering
topic Engenharia electrotécnica, electrónica e informática
Electrical engineering, Electronic engineering, Information engineering
publishDate 2021
dc.date.none.fl_str_mv 2021-10-14
2021-10-14T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://hdl.handle.net/10216/137327
TID:202820912
url https://hdl.handle.net/10216/137327
identifier_str_mv TID:202820912
dc.language.iso.fl_str_mv por
language por
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
_version_ 1799135685674795008