nodeML - Towards reproducible ML in federated environments
Main author: | Silva, Edgar Simão da Mota e |
---|---|
Publication date: | 2022 |
Document type: | Dissertation |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10400.22/21438 |
Abstract: | Advances and increasing interest in AI (Artificial Intelligence) in the field of health have created novel issues, namely explainability and reproducibility of ML (Machine Learning) models. In addition, while the training of ML models traditionally favors a centralized approach, scalability and privacy issues seem to lead towards a distributed one. The latter poses challenges to ML algorithms and the efficacy of learning itself. Reproducing ML models poses several challenges arising from the intrinsic variability of the models themselves and the environment where they are trained. This problem is aggravated by their lack of standardization and common terminology. The main goal of this work is to conceptualize and prototype a framework to train, evaluate and describe ML models, in a decentralized way, over immunogenetics datasets. This framework will promote model reproducibility and comparability, as well as its adaptability. This work will start by implementing a federated/decentralized training framework over existing ML pipelines. Then, it will be possible to list and select potential dataset sources, aiming to provide an easy path to model adaptation and optimization. |
id |
RCAP_34ecd3c9586ae601bf0cc53236af5c15 |
---|---|
oai_identifier_str |
oai:recipp.ipp.pt:10400.22/21438 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
spelling |
nodeML - Towards reproducible ML in federated environments; Federated learning; Decentralization; Machine Learning; Immunology; Immunotherapy; Genetics; Advances and increasing interest in AI (Artificial Intelligence) in the field of health have created novel issues, namely explainability and reproducibility of ML (Machine Learning) models. In addition, while the training of ML models traditionally favors a centralized approach, scalability and privacy issues seem to lead towards a distributed one. The latter poses challenges to ML algorithms and the efficacy of learning itself. Reproducing ML models poses several challenges arising from the intrinsic variability of the models themselves and the environment where they are trained. This problem is aggravated by their lack of standardization and common terminology. The main goal of this work is to conceptualize and prototype a framework to train, evaluate and describe ML models, in a decentralized way, over immunogenetics datasets. This framework will promote model reproducibility and comparability, as well as its adaptability. This work will start by implementing a federated/decentralized training framework over existing ML pipelines. Then, it will be possible to list and select potential dataset sources, aiming to provide an easy path to model adaptation and optimization.; [Translated from the Portuguese abstract] Continuous advances and growing interest in AI (Artificial Intelligence) in the field of health have raised new questions, namely the explainability and reproducibility of ML (Machine Learning) models. Additionally, while the training of ML models traditionally favors a centralized approach, scalability and privacy concerns tend to lead to a distributed approach. The latter presents challenges to ML algorithms and to the efficacy of the training itself. Reproducing ML models presents several challenges arising from the intrinsic variability of the models themselves and of the environment where they are trained. This problem is aggravated by the lack of standardization and common terminology. The main goal of this work is to conceptualize and prototype a framework to train, evaluate and describe ML models, in a decentralized way, over immunogenetic datasets. This framework will promote the reproducibility and comparability of the models, as well as their adaptability. This work will begin with the implementation of a federated/decentralized training framework over existing ML pipelines. Next, it will be possible to list and select potential data sources, with the hope of facilitating the adaptation and optimization of the models.; Faria, Luiz Felipe Rocha de; Repositório Científico do Instituto Politécnico do Porto; Silva, Edgar Simão da Mota e; 2023-01-11T15:03:45Z; 2022; 2022-01-01T00:00:00Z; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/masterThesis; application/pdf; http://hdl.handle.net/10400.22/21438; TID:203112628; eng; info:eu-repo/semantics/openAccess; reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP; 2023-03-13T13:17:17Z; oai:recipp.ipp.pt:10400.22/21438; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; opendoar:7160; 2024-03-19T17:41:30.202705; Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; false |
dc.title.none.fl_str_mv |
nodeML - Towards reproducible ML in federated environments |
title |
nodeML - Towards reproducible ML in federated environments |
spellingShingle |
nodeML - Towards reproducible ML in federated environments; Silva, Edgar Simão da Mota e; Federated learning; Decentralization; Machine Learning; Immunology; Immunotherapy; Genetics |
title_short |
nodeML - Towards reproducible ML in federated environments |
title_full |
nodeML - Towards reproducible ML in federated environments |
title_fullStr |
nodeML - Towards reproducible ML in federated environments |
title_full_unstemmed |
nodeML - Towards reproducible ML in federated environments |
title_sort |
nodeML - Towards reproducible ML in federated environments |
author |
Silva, Edgar Simão da Mota e |
author_facet |
Silva, Edgar Simão da Mota e |
author_role |
author |
dc.contributor.none.fl_str_mv |
Faria, Luiz Felipe Rocha de; Repositório Científico do Instituto Politécnico do Porto |
dc.contributor.author.fl_str_mv |
Silva, Edgar Simão da Mota e |
dc.subject.por.fl_str_mv |
Federated learning; Decentralization; Machine Learning; Immunology; Immunotherapy; Genetics |
topic |
Federated learning; Decentralization; Machine Learning; Immunology; Immunotherapy; Genetics |
description |
Advances and increasing interest in AI (Artificial Intelligence) in the field of health have created novel issues, namely explainability and reproducibility of ML (Machine Learning) models. In addition, while the training of ML models traditionally favors a centralized approach, scalability and privacy issues seem to lead towards a distributed one. The latter poses challenges to ML algorithms and the efficacy of learning itself. Reproducing ML models poses several challenges arising from the intrinsic variability of the models themselves and the environment where they are trained. This problem is aggravated by their lack of standardization and common terminology. The main goal of this work is to conceptualize and prototype a framework to train, evaluate and describe ML models, in a decentralized way, over immunogenetics datasets. This framework will promote model reproducibility and comparability, as well as its adaptability. This work will start by implementing a federated/decentralized training framework over existing ML pipelines. Then, it will be possible to list and select potential dataset sources, aiming to provide an easy path to model adaptation and optimization. |
publishDate |
2022 |
dc.date.none.fl_str_mv |
2022; 2022-01-01T00:00:00Z; 2023-01-11T15:03:45Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10400.22/21438; TID:203112628 |
url |
http://hdl.handle.net/10400.22/21438 |
identifier_str_mv |
TID:203112628 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799131502922956800 |