Machine learning models in decision support systems for diagnosing colorectal cancer based on metabolic profiles

Bibliographic details
Main author: Barbosa, Rui Xavier Ferreira
Publication date: 2023
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10400.22/24307
Abstract: In today’s ever-evolving technological landscape, the volume of data across sectors is growing, particularly in healthcare. Here, the gathering and processing of biochemical data aim to refine decision-making for patient treatments, especially using tools based on Machine Learning (ML). As a subset of Artificial Intelligence, ML harnesses algorithms to predict outcomes or unearth patterns that might otherwise remain concealed. The interpretability of ML models is pivotal, enabling healthcare professionals to place confidence in and decipher the model’s predictions. This assumes particular significance when decisions could directly affect patient lives. This research embarked on an in-depth exploration of various ML algorithms and techniques to discern whether the combined metabolic profiles of amino acids and acylcarnitines might serve as new biochemical indicators for predicting colorectal cancer prognosis. Throughout this study, several algorithms and data preprocessing techniques were evaluated. Four distinct experiments validated the predictions of the models in different scenarios. These scenarios involved predicting colorectal cancer using amino acids with and without the age parameter, and similarly, using acylcarnitines with and without the age parameter. Each scenario’s predictions were elucidated using SHAP, both for overarching feature significance and individual instances. Preliminary analyses indicated that the constructed models demonstrated promising predictive power, with notable variations for the different scenarios. Amongst the algorithms tested, Random Forest, Support Vector Machine, Gaussian Naive Bayes, and Gradient Boosting emerged as the top performers.
id RCAP_3ad4a273e6604371b93ad3ab39d011aa
oai_identifier_str oai:recipp.ipp.pt:10400.22/24307
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.contributor.none.fl_str_mv Tavares, José Antonio Reis
Repositório Científico do Instituto Politécnico do Porto
dc.contributor.author.fl_str_mv Barbosa, Rui Xavier Ferreira
dc.subject.por.fl_str_mv Colorectal Cancer
Machine Learning
Amino Acids
Acylcarnitines
ExplainableAI
Scientific Domain/Area::Engineering and Technology
publishDate 2023
dc.date.none.fl_str_mv 2023-12-20T09:07:03Z
2023-11-17
2023-11-17T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10400.22/24307
TID:203414543
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP