Representation learning for breast cancer lesion detection
Main author: | Raimundo, João Nuno Centeno |
---|---|
Publication date: | 2022 |
Document type: | Dissertation |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10400.26/43020 |
Abstract: | Breast cancer (BC) is the cancer with the second-highest incidence in women and is responsible for the death of hundreds of thousands of women every year. However, when it is detected in the early stages of the disease, treatment has proven very effective in increasing life expectancy, and in many cases patients recover fully. Several medical imaging modalities, such as mammography (MG, X-ray), ultrasound (US), computed tomography (CT), magnetic resonance imaging (MRI), and tomosynthesis, have been explored to support radiologists and physicians in clinical decision-making workflows for the detection and diagnosis of BC. MG is the most widely used imaging modality worldwide; however, recent research has shown that breast MRI is more sensitive than mammography in finding pathological lesions and is not limited by breast density. There is therefore a current trend to introduce MRI-based breast assessment into clinical workflows (screening and diagnosis). Compared with MG, however, it increases the workload of radiologists and physicians: MRI assessment is more time-consuming, and its effectiveness is affected not only by the variety of morphological characteristics of each specific tumor phenotype and its origin but also by human fatigue. Computer-Aided Detection (CADe) methods have been widely explored, primarily in mammography screening tasks, but detection remains an unsolved problem in breast MRI settings. This work aims to explore and validate BC detection models using machine (deep) learning algorithms. As the main contribution, we developed and validated an innovative method that improves the breast MRI preprocessing phase by selecting the patient's image slices and the bounding boxes that represent pathological lesions. This makes it possible to build a more robust training dataset for the deep learning models, reducing computation time and dataset size, and, more importantly, to identify with high accuracy the specific regions (bounding boxes) in each patient image in which a possible pathological lesion (tumor) has been identified. In experiments on a fully annotated, publicly released dataset comprising 922 MRI-based BC patient cases, the most accurate trained model achieved an accuracy of 97.83%; under ten-fold cross-validation, the trained models achieved a mean accuracy of 94.46% with a standard deviation of 2.43%. |
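
The ten-fold cross-validation figures reported in the abstract (a mean accuracy and its standard deviation across folds) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the synthetic one-dimensional data, the toy threshold classifier, and the helper names `k_fold_indices`, `cross_validate`, `train_threshold`, and `predict_threshold` are hypothetical and are not the thesis's models, dataset, or code. The abstract also does not say whether the sample or population standard deviation was used; the sketch uses the sample form.

```python
import random
import statistics


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(samples, labels, train_fn, predict_fn, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, return one accuracy per fold."""
    accuracies = []
    for held_out in k_fold_indices(len(samples), k, seed):
        held_out_set = set(held_out)
        train_idx = [i for i in range(len(samples)) if i not in held_out_set]
        model = train_fn([samples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(predict_fn(model, samples[i]) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return accuracies


# Hypothetical stand-in for a real detector: a midpoint threshold on one feature.
def train_threshold(xs, ys):
    positives = [x for x, y in zip(xs, ys) if y == 1]
    negatives = [x for x, y in zip(xs, ys) if y == 0]
    return (statistics.mean(positives) + statistics.mean(negatives)) / 2


def predict_threshold(threshold, x):
    return 1 if x > threshold else 0


# Synthetic data only (NOT the 922-case MRI dataset described in the abstract).
rng = random.Random(42)
labels = [i % 2 for i in range(200)]
samples = [rng.gauss(2.0 if y else 0.0, 1.0) for y in labels]

fold_acc = cross_validate(samples, labels, train_threshold, predict_threshold, k=10)
mean_acc = statistics.mean(fold_acc)
std_acc = statistics.stdev(fold_acc)  # sample standard deviation (an assumption)
print(f"mean accuracy = {mean_acc:.2%}, std = {std_acc:.2%}")
```

Reporting the spread across folds alongside the mean, as the abstract does, shows how much a single best-model figure may depend on the particular train/test split.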
id | RCAP_c5baf29d4a73cc0b5063adb1cc8de42d |
---|---|
oai_identifier_str | oai:comum.rcaap.pt:10400.26/43020 |
network_acronym_str | RCAP |
network_name_str | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str | 7160 |
dc.title.none.fl_str_mv | Representation learning for breast cancer lesion detection |
title | Representation learning for breast cancer lesion detection |
spellingShingle | Representation learning for breast cancer lesion detection; Raimundo, João Nuno Centeno; Breast Cancer Detection; Magnetic Resonance Imaging; Computer Vision; Machine Learning; Deep Learning; Convolutional Neural Networks |
title_short | Representation learning for breast cancer lesion detection |
title_full | Representation learning for breast cancer lesion detection |
title_fullStr | Representation learning for breast cancer lesion detection |
title_full_unstemmed | Representation learning for breast cancer lesion detection |
title_sort | Representation learning for breast cancer lesion detection |
author | Raimundo, João Nuno Centeno |
author_facet | Raimundo, João Nuno Centeno |
author_role | author |
dc.contributor.none.fl_str_mv | Guevara Lopez, Miguel; Repositório Comum |
dc.contributor.author.fl_str_mv | Raimundo, João Nuno Centeno |
dc.subject.por.fl_str_mv | Breast Cancer Detection; Magnetic Resonance Imaging; Computer Vision; Machine Learning; Deep Learning; Convolutional Neural Networks |
topic | Breast Cancer Detection; Magnetic Resonance Imaging; Computer Vision; Machine Learning; Deep Learning; Convolutional Neural Networks |
publishDate | 2022 |
dc.date.none.fl_str_mv | 2022-12; 2022-12-01T00:00:00Z; 2023-01-06T14:52:05Z |
dc.type.status.fl_str_mv | info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv | info:eu-repo/semantics/masterThesis |
format | masterThesis |
status_str | publishedVersion |
dc.identifier.uri.fl_str_mv | http://hdl.handle.net/10400.26/43020; TID:203231260 |
url | http://hdl.handle.net/10400.26/43020 |
identifier_str_mv | TID:203231260 |
dc.language.iso.fl_str_mv | eng |
language | eng |
dc.rights.driver.fl_str_mv | info:eu-repo/semantics/openAccess |
eu_rights_str_mv | openAccess |
dc.format.none.fl_str_mv | application/pdf |
dc.source.none.fl_str_mv | reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP |
instname_str | Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str | RCAAP |
institution | RCAAP |
reponame_str | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv | |
_version_ | 1799135400500920320 |