Representation learning for breast cancer lesion detection

Bibliographic details
Main author: Raimundo, João Nuno Centeno
Publication date: 2022
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10400.26/43020
Abstract: Breast cancer (BC) is the cancer with the second-highest incidence in women and is responsible for the deaths of hundreds of thousands of women every year. However, when it is detected in the early stages of the disease, treatment has proven very effective in increasing life expectancy, and in many cases patients fully recover. Several medical imaging modalities, such as mammography (MG, X-ray), ultrasound (US), computed tomography (CT), magnetic resonance imaging (MRI), and tomosynthesis, have been explored to support radiologists and physicians in clinical decision-making workflows for the detection and diagnosis of BC. MG is the most widely used imaging modality worldwide; however, recent research has shown that breast MRI is more sensitive than mammography in finding pathological lesions and is not limited by breast density. There is therefore a current trend to introduce MRI-based breast assessment into clinical workflows (screening and diagnosis). Compared with MG, however, this increases the workload of radiologists and physicians: MRI assessment is more time-consuming, and its effectiveness is affected not only by the variety of morphological characteristics of each specific tumor phenotype and its origin but also by human fatigue. Computer-Aided Detection (CADe) methods have been widely explored, primarily in mammography screening tasks, but CADe remains an unsolved problem in breast MRI settings. This work aims to explore and validate BC detection models using machine (deep) learning algorithms. As the main contribution, we have developed and validated an innovative method that improves the "breast MRI preprocessing phase" to select the patient's image slices and bounding boxes representing pathological lesions.
With this, it is possible to build a more robust training dataset to feed the deep learning models, reducing the computation time and the size of the dataset and, more importantly, identifying with high accuracy the specific regions (bounding boxes) in each patient image in which a possible pathological lesion (tumor) has been identified. In experiments using a fully annotated, publicly released dataset comprising 922 MRI-based BC patient cases, the most accurate trained model achieved an accuracy of 97.83%; subsequently, ten-fold cross-validation yielded a mean accuracy of 94.46% across the trained models, with a standard deviation of 2.43%.
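The ten-fold cross-validation figures in the abstract (a mean accuracy with an associated standard deviation) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the helper names `k_fold_indices` and `summarize` are hypothetical, the split is done per patient case, the per-fold accuracies are invented placeholders rather than results from the dissertation, and the record does not say whether a sample or population standard deviation was reported (the sample form is used here).

```python
import random
import statistics


def k_fold_indices(n_items, k=10, seed=0):
    """Shuffle item indices and deal them into k near-equal folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at position i.
    return [indices[i::k] for i in range(k)]


def summarize(fold_accuracies):
    """Return (mean, sample standard deviation) of per-fold accuracies."""
    return statistics.mean(fold_accuracies), statistics.stdev(fold_accuracies)


# 922 is the number of patient cases stated in the abstract.
folds = k_fold_indices(922, k=10)
fold_sizes = sorted(len(fold) for fold in folds)

# Placeholder accuracies, invented for illustration only.
placeholder_accuracies = [0.90, 0.92, 0.94, 0.96, 0.98,
                          0.90, 0.92, 0.94, 0.96, 0.98]
mean_acc, std_acc = summarize(placeholder_accuracies)
```

In each round, one fold would serve as the held-out test set and the other nine as training data; the reported mean and standard deviation are then taken over the ten held-out accuracies.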
id RCAP_c5baf29d4a73cc0b5063adb1cc8de42d
oai_identifier_str oai:comum.rcaap.pt:10400.26/43020
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
spelling Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; opendoar:7160; 2023-11-21T09:57:21Z; 2024-03-19T23:12:44.394563
dc.title.none.fl_str_mv Representation learning for breast cancer lesion detection
author Raimundo, João Nuno Centeno
author_facet Raimundo, João Nuno Centeno
author_role author
dc.contributor.none.fl_str_mv Guevara Lopez, Miguel
Repositório Comum
dc.contributor.author.fl_str_mv Raimundo, João Nuno Centeno
dc.subject.por.fl_str_mv Breast Cancer Detection
Magnetic Resonance Imaging
Computer Vision
Machine Learning
Deep Learning
Convolutional Neural Networks
publishDate 2022
dc.date.none.fl_str_mv 2022-12
2022-12-01T00:00:00Z
2023-01-06T14:52:05Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10400.26/43020
TID:203231260
url http://hdl.handle.net/10400.26/43020
identifier_str_mv TID:203231260
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
_version_ 1799135400500920320