Machine learning models in decision support systems for diagnosing colorectal cancer based on metabolic profiles
Main author: | Barbosa, Rui Xavier Ferreira |
---|---|
Publication date: | 2023 |
Document type: | Dissertation |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10400.22/24307 |
Abstract: | In today’s ever-evolving technological landscape, the volume of data across sectors is growing, particularly in healthcare. Here, the gathering and processing of biochemical data aim to refine decision-making for patient treatments, especially using tools based on Machine Learning (ML). As a subset of Artificial Intelligence, ML harnesses algorithms to predict outcomes or unearth patterns that might otherwise remain concealed. The interpretability of ML models is pivotal, enabling healthcare professionals to place confidence in and decipher the model’s predictions. This assumes particular significance when decisions could directly affect patient lives. This research embarked on an in-depth exploration of various ML algorithms and techniques to discern whether the combined metabolic profiles of amino acids and acylcarnitines might serve as new biochemical indicators for predicting colorectal cancer prognosis. Throughout this study, several algorithms and data preprocessing techniques were evaluated. Four distinct experiments validated the predictions of the models in different scenarios. These scenarios involved predicting colorectal cancer using amino acids with and without the age parameter, and similarly, using acylcarnitines with and without the age parameter. Each scenario’s predictions were elucidated using SHAP, both for overarching feature significance and individual instances. Preliminary analyses indicated that the constructed models demonstrated promising predictive power, with notable variations for the different scenarios. Amongst the algorithms tested, Random Forest, Support Vector Machine, Gaussian Naive Bayes, and Gradient Boosting emerged as the top performers. |
id |
RCAP_3ad4a273e6604371b93ad3ab39d011aa |
---|---|
oai_identifier_str |
oai:recipp.ipp.pt:10400.22/24307 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
dc.title.none.fl_str_mv |
Machine learning models in decision support systems for diagnosing colorectal cancer based on metabolic profiles |
author |
Barbosa, Rui Xavier Ferreira |
author_facet |
Barbosa, Rui Xavier Ferreira |
author_role |
author |
dc.contributor.none.fl_str_mv |
Tavares, José Antonio Reis; Repositório Científico do Instituto Politécnico do Porto |
dc.contributor.author.fl_str_mv |
Barbosa, Rui Xavier Ferreira |
dc.subject.por.fl_str_mv |
Colorectal Cancer; Machine Learning; Amino Acids; Acylcarnitines; ExplainableAI; Domínio/Área Científica::Engenharia e Tecnologia |
publishDate |
2023 |
dc.date.none.fl_str_mv |
2023-12-20T09:07:03Z; 2023-11-17; 2023-11-17T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10400.22/24307; TID:203414543 |
url |
http://hdl.handle.net/10400.22/24307 |
identifier_str_mv |
TID:203414543 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799136447638274048 |