Social mining for the classification of mental illnesses in public forums

Bibliographic details
Main author: Ferreira, Rodrigo Miguel Maia
Publication date: 2022
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10773/38990
Abstract: The growing number of mental health issues is one of the biggest adversities we currently face as a society, and traditional assistance methods often fail to help those in need. In this work, we implement and evaluate the performance of a screening tool that may compensate for some of the weaknesses of traditional methods by flagging subjects at risk of developing mental illnesses who could benefit from medical assistance. The tool is based on machine learning and detects individuals at risk using their publicly available data from the social network Reddit. The work builds on our participation in tasks 1 and 2 of the 2022 edition of CLEF eRisk, which aimed to detect subjects at risk of pathological gambling and depression, respectively, with a special focus on the use and comparison of different text vectorization methods. Although the initial results obtained at the event were far from those desired, with some tweaks and additional experiments we managed to improve them, achieving final F1-scores of 0.886 and 0.653 for the best models of tasks 1 and 2, respectively.
id RCAP_94b6ea0b83d5d4fbbff99543c578c23b
oai_identifier_str oai:ria.ua.pt:10773/38990
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv Social mining for the classification of mental illnesses in public forums
title Social mining for the classification of mental illnesses in public forums
author Ferreira, Rodrigo Miguel Maia
author_facet Ferreira, Rodrigo Miguel Maia
author_role author
dc.contributor.author.fl_str_mv Ferreira, Rodrigo Miguel Maia
dc.subject.por.fl_str_mv Data mining
Machine learning
Natural language processing
Mental health
publishDate 2022
dc.date.none.fl_str_mv 2022-12-12T00:00:00Z
2022-12-12
2023-07-25T10:30:53Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10773/38990
url http://hdl.handle.net/10773/38990
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação