To be or NOT to be: The Impact of Negative Annotation in Biomedical Semantic Similarity

Bibliographic details
Main author: Aveiro, Lina Andreia Gama
Publication date: 2021
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10451/53810
Abstract: Master's thesis, Bioinformatics and Computational Biology, Universidade de Lisboa, Faculdade de Ciências, 2022
id RCAP_65f1c930c6e1f17eb22156060c727937
oai_identifier_str oai:repositorio.ul.pt:10451/53810
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
spelling To be or NOT to be: The Impact of Negative Annotation in Biomedical Semantic Similarity
Semantic similarity; Biomedical ontology; Negative annotation; Protein-protein interaction prediction; Disease prediction; Master's theses - 2022; Departamento de Informática
Master's thesis, Bioinformatics and Computational Biology, Universidade de Lisboa, Faculdade de Ciências, 2022
Classical semantic similarity measures do not consider negative annotations when computing similarity, and the impact these annotations can have on this data mining technique is not well studied. This work therefore aims to understand how adding negative annotations affects semantic similarity. To do so, two pairwise similarity measures, Best-Match Average and Resnik, were adapted to create the polar measures PolarBMA and PolarResnik. These were evaluated against the original measures in two currently relevant scopes: protein-protein interaction (PPI) prediction and disease prediction. Pairs of proteins known either to interact or not to interact were taken from STRING and enriched with positive and negative annotations from the Gene Ontology. Synthetic patients were created as sets of annotations taken from the Mendelian diseases they were designed to have, optionally with added noise or imprecise annotations. Semantic similarity was then computed with both polar and non-polar measures between the proteins in each pair, and between patients and candidate diseases, which included the Mendelian diseases as well as random diseases taken from the Human Phenotype Ontology. To evaluate how the polar measures performed against the baseline, a ranking by semantic similarity was produced for each measure and scope, and the rank cumulative frequencies were plotted. ROC AUC and precision-recall curves were also determined for PPI prediction, and average precision for the disease prediction dataset.
In PPI prediction, polar measures performed better in the Molecular Function branch in both experiments where negative annotations were added, and in one of the experiments with the Cellular Component branch. In the disease prediction scope, polar measures improved performance by approximately ten percent. This improvement was observed in all disease prediction experiments, even with the addition of noise and imprecision. Given these results, this work concludes that negative annotations have an impact on semantic similarity, but the magnitude of this impact requires further study.
2023-11-08T17:00:01Z; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; opendoar:7160; 2024-03-19T22:04:48.358435; false
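The abstract names Resnik and Best-Match Average (BMA) as the baseline measures. The following is a minimal sketch of those two non-polar measures in their commonly described form, assuming a tiny made-up ontology and annotation corpus (the `PARENTS` and `ANNOTATIONS` tables and all term and entity names are hypothetical, not taken from the Gene Ontology, the Human Phenotype Ontology, or the thesis). The polar variants PolarBMA and PolarResnik are not defined in this record, so they are not sketched here.

```python
import math

# Hypothetical toy ontology: term -> set of direct parents.
PARENTS = {
    "root": set(),
    "binding": {"root"},
    "catalysis": {"root"},
    "dna_binding": {"binding"},
    "rna_binding": {"binding"},
    "kinase": {"catalysis"},
}

# Hypothetical annotation corpus: entity -> directly annotated terms.
ANNOTATIONS = {
    "P1": {"dna_binding", "kinase"},
    "P2": {"rna_binding", "kinase"},
    "P3": {"dna_binding"},
    "P4": {"rna_binding"},
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(term):
    """IC(t) = -log p(t), where p(t) is the share of entities annotated
    with t or with any descendant of t."""
    hits = sum(
        1
        for terms in ANNOTATIONS.values()
        if any(term in ancestors(t) for t in terms)
    )
    if hits == 0:
        raise ValueError(f"term {term!r} has no annotations in the corpus")
    return -math.log(hits / len(ANNOTATIONS))


def resnik(t1, t2):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ancestors(t1) & ancestors(t2)
    return max(information_content(t) for t in common)


def best_match_average(terms_a, terms_b):
    """BMA: mean of the best pairwise match, averaged over both directions."""
    best_a = [max(resnik(a, b) for b in terms_b) for a in terms_a]
    best_b = [max(resnik(a, b) for a in terms_a) for b in terms_b]
    return (sum(best_a) / len(best_a) + sum(best_b) / len(best_b)) / 2


if __name__ == "__main__":
    # Rank the other toy entities by similarity to P1, highest first.
    query = "P1"
    ranking = sorted(
        (e for e in ANNOTATIONS if e != query),
        key=lambda e: best_match_average(ANNOTATIONS[query], ANNOTATIONS[e]),
        reverse=True,
    )
    print(ranking)
```

Sorting candidates by their BMA score, as in the `__main__` block, mirrors the ranking step the abstract describes; a real evaluation would compute information content from a full annotation corpus rather than four toy entities.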
dc.title.none.fl_str_mv To be or NOT to be: The Impact of Negative Annotation in Biomedical Semantic Similarity
author Aveiro, Lina Andreia Gama
author_facet Aveiro, Lina Andreia Gama
author_role author
dc.contributor.none.fl_str_mv Pesquita, Cátia, 1980-
Repositório da Universidade de Lisboa
dc.contributor.author.fl_str_mv Aveiro, Lina Andreia Gama
dc.subject.por.fl_str_mv Semantic similarity
Biomedical ontology
Negative annotation
Protein-protein interaction prediction
Disease prediction
Master's theses - 2022
Departamento de Informática
description Master's thesis, Bioinformatics and Computational Biology, Universidade de Lisboa, Faculdade de Ciências, 2022
publishDate 2021
dc.date.none.fl_str_mv 2021
2022-07-18T08:16:57Z
2022
2022-01-01T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10451/53810
url http://hdl.handle.net/10451/53810
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799134599501053952