Ensembles de OCRs para aplicações médicas
Main author: | João Adriano Portela de Matos Silva |
---|---|
Publication date: | 2021 |
Document type: | Dissertation |
Language: | por |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
Full text: | https://hdl.handle.net/10216/137327 |
Abstract: | As research relies increasingly on new technologies, there is a great advantage in extracting the data and knowledge stored in traditional media, such as books and written records, into databases. This transition is necessary because it facilitates every process that involves handling and processing data on a large scale. One case where it is needed is that of the "Child and Youth Health Bulletins", the focus of this dissertation. These documents contain information on users from birth to 20 years of age. The information is stored in one table per document, and the following pages show the same information in graphs. The bulletins hold a great deal of information that is of interest to the scientific community and that should be transferred to digital media so that it can be used in pediatric studies. The aim of the dissertation is to build an automated process based on an Optical Character Recognition (OCR) system, combined with machine learning, data mining and ensemble methods, to collect the data contained in the bulletins while obtaining the best possible predictive performance from the algorithms used. |
id |
RCAP_da4d4ea6f91619d3ab1c77a43b897892 |
---|---|
oai_identifier_str |
oai:repositorio-aberto.up.pt:10216/137327 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository_id_str |
7160 |
spelling |
Ensembles de OCRs para aplicações médicas; Engenharia electrotécnica, electrónica e informática; Electrical engineering, Electronic engineering, Information engineering; 2021-10-14; 2021-10-14T00:00:00Z; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/masterThesis; application/pdf; https://hdl.handle.net/10216/137327; TID:202820912; por; João Adriano Portela de Matos Silva; info:eu-repo/semantics/openAccess; reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP; 2023-11-29T13:16:30Z; oai:repositorio-aberto.up.pt:10216/137327; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; opendoar:7160; 2024-03-19T23:37:17.725843; Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; false |
dc.title.none.fl_str_mv |
Ensembles de OCRs para aplicações médicas |
title |
Ensembles de OCRs para aplicações médicas |
spellingShingle |
Ensembles de OCRs para aplicações médicas; João Adriano Portela de Matos Silva; Engenharia electrotécnica, electrónica e informática; Electrical engineering, Electronic engineering, Information engineering |
title_short |
Ensembles de OCRs para aplicações médicas |
title_full |
Ensembles de OCRs para aplicações médicas |
title_fullStr |
Ensembles de OCRs para aplicações médicas |
title_full_unstemmed |
Ensembles de OCRs para aplicações médicas |
title_sort |
Ensembles de OCRs para aplicações médicas |
author |
João Adriano Portela de Matos Silva |
author_facet |
João Adriano Portela de Matos Silva |
author_role |
author |
dc.contributor.author.fl_str_mv |
João Adriano Portela de Matos Silva |
dc.subject.por.fl_str_mv |
Engenharia electrotécnica, electrónica e informática; Electrical engineering, Electronic engineering, Information engineering |
topic |
Engenharia electrotécnica, electrónica e informática; Electrical engineering, Electronic engineering, Information engineering |
publishDate |
2021 |
dc.date.none.fl_str_mv |
2021-10-14 2021-10-14T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://hdl.handle.net/10216/137327 TID:202820912 |
url |
https://hdl.handle.net/10216/137327 |
identifier_str_mv |
TID:202820912 |
dc.language.iso.fl_str_mv |
por |
language |
por |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799135685674795008 |