VCE dataset generation: active learning solutions for binary classification in informative vs uninformative frames

Nunes, Beatriz Gramata

Bibliographic details
Main author:	Nunes, Beatriz Gramata
Publication date:	2023
Document type:	Dissertation
Language:	eng
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text:	http://hdl.handle.net/10773/39930
Abstract:	Video Capsule Endoscopy (VCE) is a non-invasive imaging technique that allows observation of the small bowel. However, each examination produces up to 8 to 10 hours of video that a medical expert must review and annotate, which is very time-consuming. State-of-the-art Machine Learning methods can now assist experts by automatically classifying findings in the video frames, but they need large annotated VCE datasets, and creating these demands an unaffordable annotation effort. Active Learning methodologies can optimize dataset annotation by intelligently identifying, within large unannotated datasets, the samples that contribute most to model learning. This dissertation studies Active Learning for creating VCE datasets, applied to the binary classification of frames as informative or uninformative. We explored Active Learning techniques such as Least Confidence Sampling and Margin Sampling to assess the annotation effort and the capability to rapidly create representative datasets. Least Confidence Sampling proved to be the more appropriate technique for our data, given its accuracy when dividing unseen video frames into informative and uninformative, and Active Learning showed the potential to expand existing datasets using less data and human effort.
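
The two query strategies named in the abstract can be illustrated with a short sketch. It is not taken from the dissertation: the function names, the example probabilities, and the annotation budget are all hypothetical, and it assumes the classifier outputs one probability per class for each frame. It uses the common textbook definitions, where Least Confidence Sampling prefers frames whose most likely class has the lowest probability and Margin Sampling prefers frames whose two top class probabilities are closest.

```python
import numpy as np


def least_confidence(probs):
    """Uncertainty score: 1 minus the probability of the most likely class."""
    probs = np.asarray(probs, dtype=float)
    return 1.0 - probs.max(axis=1)


def margin(probs):
    """Gap between the two highest class probabilities (small = uncertain)."""
    probs = np.asarray(probs, dtype=float)
    top_two = np.sort(probs, axis=1)[:, -2:]
    return top_two[:, 1] - top_two[:, 0]


def select_for_annotation(probs, budget, strategy="least_confidence"):
    """Return indices of the `budget` frames to send to the annotator."""
    if strategy == "least_confidence":
        # Highest uncertainty first.
        order = np.argsort(-least_confidence(probs), kind="stable")
    elif strategy == "margin":
        # Smallest margin first.
        order = np.argsort(margin(probs), kind="stable")
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return order[:budget]


# Hypothetical model outputs: columns are P(uninformative), P(informative).
example_probs = np.array([
    [0.95, 0.05],
    [0.55, 0.45],
    [0.20, 0.80],
    [0.51, 0.49],
])

picked = select_for_annotation(example_probs, budget=2)
print(picked)
```

In an Active Learning loop, the selected frames would be labelled by the expert, added to the training set, and the model retrained before the next selection round.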

Item metadata

id	RCAP_5fe0678f9d985fbe7bb9012bbd9e1620
oai_identifier_str	oai:ria.ua.pt:10773/39930
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str	7160
dc.title.none.fl_str_mv	VCE dataset generation: active learning solutions for binary classification in informative vs uninformative frames
author	Nunes, Beatriz Gramata
author_facet	Nunes, Beatriz Gramata
author_role	author
dc.contributor.author.fl_str_mv	Nunes, Beatriz Gramata
dc.subject.por.fl_str_mv	VCE Active learning Dataset creation Informative images
topic	VCE Active learning Dataset creation Informative images
publishDate	2023
dc.date.none.fl_str_mv	2023-11-27T00:00:00Z 2023-11-27 2024-01-03T14:41:58Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/masterThesis
format	masterThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10773/39930
url	http://hdl.handle.net/10773/39930
dc.language.iso.fl_str_mv	eng
language	eng
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_	1799137751048650752
