Identification of biological mechanisms underlying a multidimensional ASD phenotype using machine learning

Bibliographic details
Main author: Asif, Muhammad
Publication date: 2020
Other authors: Martiniano, Hugo F.M.C., Marques, Ana Rita, Santos, João Xavier, Vilela, Joana, Rasga, Celia, Oliveira, Guiomar, Couto, Francisco M., Vicente, Astrid M.
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10400.18/7319
Abstract: The complex genetic architecture of Autism Spectrum Disorder (ASD) and its heterogeneous phenotype make molecular diagnosis and patient prognosis challenging tasks. To establish more precise genotype-phenotype correlations in ASD, we developed a novel machine-learning integrative approach, which seeks to delineate associations between patients' clinical profiles and disrupted biological processes, inferred from their copy number variants (CNVs) that span brain genes. Clustering analysis of the relevant clinical measures from 2446 ASD cases in the Autism Genome Project identified two distinct phenotypic subgroups. Patients in these clusters differed significantly in ADOS-defined severity, adaptive behavior profiles, intellectual ability, and verbal status, the latter contributing the most to cluster stability and cohesion. Functional enrichment analysis of brain genes disrupted by CNVs in these ASD cases identified 15 statistically significant biological processes, including cell adhesion, neural development, cognition, and polyubiquitination, in line with previous ASD findings. A Naive Bayes classifier, generated to predict the ASD phenotypic clusters from disrupted biological processes, achieved predictions with high precision (0.82) but low recall (0.39) for a subset of patients with higher biological Information Content scores. This study shows that milder and more severe clinical presentations can have distinct underlying biological mechanisms. It further highlights how machine-learning approaches can reduce clinical heterogeneity by using multidimensional clinical measures, and establishes genotype-phenotype correlations in ASD. However, predictions are strongly dependent on patients' information content. Findings are therefore a first step toward the translation of genetic information into clinically useful applications, and emphasize the need for larger datasets with very complete clinical and biological information.
id RCAP_d3b562cecdcc6bd5c975c9b9c70ffab6
oai_identifier_str oai:repositorio.insa.pt:10400.18/7319
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
spelling The work was supported by the Portuguese Fundação para a Ciência e Tecnologia (FCT) through funding to BioISI (Ref: UID/MULTI/04046/2013), the LASIGE Research Unit (Ref: UID/CEC/00408/2019), and the DeST: Deep Semantic Tagger project (Ref: PTDC/CCI-BIO/28685/2017). M.A., A.R.M., J.X.S., and J.V. were recipients of BioSys PhD programme fellowships from FCT (Portugal) with references PD/BD/52485/2014, PD/BD/113773/2015, PD/BD/114386/2016, and PD/BD/131390/2017, respectively. C.R. is the recipient of a grant from FCT (Ref: POCI01-0145-FEDER-016428). Patients and parents were genotyped in the context of the Autism Genome Project (AGP), funded by NIMH, HRB, MRC, Autism Speaks, Hilibrand Foundation, Genome Canada, OGI, and CIHR. We acknowledge the families who participated in these projects.
2023-07-20T15:41:47Z
Portal Agregador
ONG
https://www.rcaap.pt/oai/openaire
opendoar:7160
2024-03-19T18:41:44.918906
dc.title.none.fl_str_mv Identification of biological mechanisms underlying a multidimensional ASD phenotype using machine learning
dc.contributor.none.fl_str_mv Repositório Científico do Instituto Nacional de Saúde
dc.contributor.author.fl_str_mv Asif, Muhammad
Martiniano, Hugo F.M.C.
Marques, Ana Rita
Santos, João Xavier
Vilela, Joana
Rasga, Celia
Oliveira, Guiomar
Couto, Francisco M.
Vicente, Astrid M.
dc.subject.por.fl_str_mv Autism
Autism Spectrum Disorder (ASD)
Neurodevelopmental Disorder
ASD Phenotype
Child Developmental Disorders and Mental Health (Perturbações do Desenvolvimento Infantil e Saúde Mental)
publishDate 2020
dc.date.none.fl_str_mv 2020-01-28
2020-01-28T00:00:00Z
2021-03-04T18:58:25Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10400.18/7319
dc.language.iso.fl_str_mv eng
dc.relation.none.fl_str_mv Transl Psychiatry. 2020 Jan 28;10(1):43. doi: 10.1038/s41398-020-0721-1.
2158-3188
10.1038/s41398-020-0721-1
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Springer Nature
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação