Social mining for the classification of mental illnesses in public forums

Ferreira, Rodrigo Miguel Maia

Bibliographic details
Main author:	Ferreira, Rodrigo Miguel Maia
Publication date:	2022
Document type:	Dissertation
Language:	eng
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text:	http://hdl.handle.net/10773/38990
Abstract:	The growing prevalence of mental health issues is one of the biggest adversities we face as a society, and traditional assistance methods often fail to help those in need. In this work, we implement and evaluate a screening tool that may compensate for some weaknesses of the traditional methods by flagging subjects at risk of developing mental illnesses who could benefit from medical assistance. The tool is based on machine learning and detects individuals at risk using their publicly available data from the social network Reddit. The work builds on our participation in tasks 1 and 2 of the 2022 edition of CLEF eRisk, which aimed to detect subjects at risk of pathological gambling and depression, respectively, with a special focus on the use and comparison of different text vectorization methods. Although the initial results obtained at the event were far from those desired, further tuning and additional experiments improved them, yielding final F1-scores of 0.886 and 0.653 for the best models of tasks 1 and 2, respectively.

Item metadata

id	RCAP_94b6ea0b83d5d4fbbff99543c578c23b
oai_identifier_str	oai:ria.ua.pt:10773/38990
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str	7160
dc.title.none.fl_str_mv	Social mining for the classification of mental illnesses in public forums
title	Social mining for the classification of mental illnesses in public forums
title_short	Social mining for the classification of mental illnesses in public forums
title_full	Social mining for the classification of mental illnesses in public forums
title_fullStr	Social mining for the classification of mental illnesses in public forums
title_full_unstemmed	Social mining for the classification of mental illnesses in public forums
title_sort	Social mining for the classification of mental illnesses in public forums
author	Ferreira, Rodrigo Miguel Maia
author_facet	Ferreira, Rodrigo Miguel Maia
author_role	author
dc.contributor.author.fl_str_mv	Ferreira, Rodrigo Miguel Maia
dc.subject.por.fl_str_mv	Data mining; Machine learning; Natural language processing; Mental health
topic	Data mining; Machine learning; Natural language processing; Mental health
publishDate	2022
dc.date.none.fl_str_mv	2022-12-12T00:00:00Z 2022-12-12 2023-07-25T10:30:53Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/masterThesis
format	masterThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10773/38990
url	http://hdl.handle.net/10773/38990
dc.language.iso.fl_str_mv	eng
language	eng
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_	1799137743483174912
