VCE dataset generation: active learning solutions for binary classification in informative vs uninformative frames
Main author: | Nunes, Beatriz Gramata |
---|---|
Publication date: | 2023 |
Document type: | Dissertation |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10773/39930 |
Abstract: | Video Capsule Endoscopy (VCE) is a non-invasive imaging technique that allows observation of the small bowel. However, a medical expert must review and annotate up to 8 to 10 hours of video, which is very time-consuming. State-of-the-art Machine Learning methods can now assist experts by automatically classifying findings in the video frames, but they need large annotated VCE datasets, and building these requires an unaffordable annotation effort. Active Learning methodologies can optimize dataset annotation by intelligently identifying, within large non-annotated datasets, the samples that contribute most to model learning. This dissertation studies Active Learning for creating VCE datasets, applied to the binary classification of frames as informative or uninformative. We explored Active Learning techniques such as Least Confidence Sampling and Margin Sampling to assess the annotation effort and the ability to rapidly create representative datasets. Least Confidence Sampling proved the more appropriate technique for our data, judged by the accuracy obtained when dividing unseen video frames into informative and uninformative, and Active Learning showed the potential to expand existing datasets using less data and human effort. |
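
The two query strategies named in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names, the `pool_probs` input (one list of predicted class probabilities per unlabeled frame) and the batch size `k` are assumptions made for illustration only.

```python
# Illustrative sketch of two uncertainty-based query strategies.
# All names here are hypothetical; the source record does not describe
# the actual model, data format, or selection code.

def least_confidence(probs):
    """Uncertainty score: 1 minus the probability of the most likely class.
    Higher means the model is less sure about this frame."""
    return 1.0 - max(probs)


def margin(probs):
    """Gap between the two most likely classes.
    Smaller means the model is less sure about this frame."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


def select_for_annotation(pool_probs, k, strategy="least_confidence"):
    """Return the indices of the k pool frames to send to the annotator."""
    if strategy == "least_confidence":
        # Most uncertain first: largest least-confidence score.
        key = lambda i: -least_confidence(pool_probs[i])
    elif strategy == "margin":
        # Most uncertain first: smallest margin.
        key = lambda i: margin(pool_probs[i])
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return sorted(range(len(pool_probs)), key=key)[:k]


# Made-up predictions for four unlabeled frames:
# [P(informative), P(uninformative)]
pool_probs = [
    [0.95, 0.05],
    [0.55, 0.45],
    [0.70, 0.30],
    [0.51, 0.49],
]

picked = select_for_annotation(pool_probs, k=2)
print(picked)
```

In an active learning loop, the selected frames would be labeled by the expert, added to the training set, and the classifier retrained before the next selection round.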
id |
RCAP_5fe0678f9d985fbe7bb9012bbd9e1620 |
---|---|
oai_identifier_str |
oai:ria.ua.pt:10773/39930 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
dc.title.none.fl_str_mv |
VCE dataset generation: active learning solutions for binary classification in informative vs uninformative frames |
author |
Nunes, Beatriz Gramata |
author_role |
author |
dc.subject.por.fl_str_mv |
VCE; Active learning; Dataset creation; Informative images |
publishDate |
2023 |
dc.date.none.fl_str_mv |
2023-11-27T00:00:00Z; 2023-11-27; 2024-01-03T14:41:58Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10773/39930 |
dc.language.iso.fl_str_mv |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |