Social mining for the classification of mental illnesses in public forums
Main author: | Ferreira, Rodrigo Miguel Maia |
---|---|
Publication date: | 2022 |
Document type: | Dissertation |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10773/38990 |
Abstract: | The growing number of mental health issues is one of the biggest challenges we face as a society, and traditional assistance methods often fail to help those in need. In this work, we implement and evaluate a screening tool that may compensate for some of the weaknesses of traditional methods by flagging subjects at risk of developing mental illnesses who could benefit from medical assistance. The tool is based on machine learning and detects at-risk individuals using their publicly available data from the social network Reddit. This work builds on our participation in tasks 1 and 2 of the 2022 edition of CLEF eRisk, which aimed to detect subjects at risk of pathological gambling and depression, respectively, with a special focus on the use and comparison of different text vectorization methods. Although the initial results obtained at the event were far from those desired, further tuning and additional experiments improved them, yielding final F1-scores of 0.886 and 0.653 for the best models of tasks 1 and 2, respectively. |
id |
RCAP_94b6ea0b83d5d4fbbff99543c578c23b |
---|---|
oai_identifier_str |
oai:ria.ua.pt:10773/38990 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
dc.title.none.fl_str_mv |
Social mining for the classification of mental illnesses in public forums |
author |
Ferreira, Rodrigo Miguel Maia |
author_role |
author |
dc.subject.por.fl_str_mv |
Data mining; Machine learning; Natural language processing; Mental health |
publishDate |
2022 |
dc.date.none.fl_str_mv |
2022-12-12T00:00:00Z 2022-12-12 2023-07-25T10:30:53Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10773/38990 |
dc.language.iso.fl_str_mv |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
_version_ |
1799137743483174912 |