Deep Learning for Automatic Segmentation and Classification of Adventitious Respiratory Sounds

Fernandes, Tiago Ferreira

Bibliographic details
Main author:	Fernandes, Tiago Ferreira
Publication date:	2022
Document type:	Dissertation
Language:	eng
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
Full text:	http://hdl.handle.net/10316/102160
Abstract:	Master's dissertation in Data Science and Engineering presented to the Faculty of Sciences and Technology

Item metadata

id	RCAP_999b59c455188f2fc424fe6cbb356a4d
oai_identifier_str	oai:estudogeral.uc.pt:10316/102160
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
repository_id_str	7160
spelling	Deep Learning for Automatic Segmentation and Classification of Adventitious Respiratory Sounds / Deep Learning para Segmentação e Classificação Automática de Sons Respiratórios Adventícios
Keywords: Deep Learning; Machine Learning; Adventitious Respiratory Sounds; Classification; Segmentation
Master's dissertation in Data Science and Engineering presented to the Faculty of Sciences and Technology
Respiratory diseases are among the deadliest in the world. These pathologies are characterised by Adventitious Respiratory Sounds (ARS), such as wheezes and crackles, throughout the respiratory cycle.
In this thesis, a study was conducted on the applicability of Deep Learning (DL) to automatically classify and segment the adventitious and normal respiratory sounds present in patients' respiratory cycles, mainly wheezes and crackles. Since DL models require large and diverse datasets, three datasets were used: the Respiratory Sound Database (RSD), a variation of the RSD that is not publicly available (RSD New Annotations), and the HF_Lung_V1 dataset. Several DL architectures, namely Convolutional Neural Network (CNN), Bidirectional Long Short-Term Memory (BiLSTM), and a combination of both (CNN-BiLSTM), were developed, along with classical Machine Learning (ML) models, namely Linear Discriminant Analysis (LDA), Support Vector Machine with radial basis function (SVMrbf), and Random Undersampling Boosted Trees (RUSBoost), in order to evaluate and compare different ML approaches.
In the classification phase, the classical ML approaches described above served as baseline models to compare against the DL models (CNNs) and as a first step in adapting to work with this kind of data. These models were replicated from previous work by our team on the three datasets mentioned above (F1-Score macro of 79.1% on the RSD, F1-Score macro of 68.8% on the RSD New Annotations, and F1-Score macro of 65.2% on HF_Lung_V1). The datasets were also crossed to better understand the capability of these models to generalise, which proved not very successful given the differences in the event annotations between the datasets (F1-Score macro of 39.5% trained on RSD and tested on HF_Lung_V1, and F1-Score macro of 38.8% trained on HF_Lung_V1 and tested on RSD; 3-class problem). In addition, a stratification of the RSD using the same models was performed to better understand which demographic category and recording device achieved better results (F1-Score macro of 81.8% with the AKGC417L microphone, F1-Score macro of 78.9% in Adults, F1-Score macro of 79.6% in Male subjects, F1-Score macro of 83.3% in subjects with Normal body-mass index, and F1-Score macro of 85.3% in subjects with Non-Chronic diagnosis).
In the segmentation phase, two approaches were developed. The first consisted of two CNNs that classify individual frames, and it was used as a baseline to compare with the second approach. The second approach was a replication of one of the models from a cited article, the CNN-BiLSTM, which achieved better results than the first approach on RSD (F1-Score of 26.8% vs. F1-Score of 22.3% in crackles vs. normal sounds, and F1-Score of 26.5% vs. F1-Score of 41.0% in wheezes vs. normal sounds) and on HF_Lung_V1 (F1-Score of 35.9% vs. F1-Score of 41.5% in crackles vs. normal sounds, and F1-Score of 26.0% vs. F1-Score of 42.1% in wheezes vs. normal sounds). A crossing between both datasets was also performed to check the capability of these models to generalise, which again proved not very successful given the differences in the event annotations between the datasets. Finally, a small stratification of the RSD was performed with this last model, using only the recordings made with the AKGC417L microphone (F1-Score of 14.7% in crackles vs. normal sounds, and F1-Score of 41.6% in wheezes vs. normal sounds).
The proposed solutions for classification and segmentation advanced the state of the art on this problem, especially on the RSD, although the current approaches still need to be improved before they can be used accurately in real-world scenarios.
H2020 | 2022-09-08 | info:eu-repo/semantics/publishedVersion | info:eu-repo/semantics/masterThesis | http://hdl.handle.net/10316/102160 | TID:203062698 | eng | Fernandes, Tiago Ferreira | info:eu-repo/semantics/openAccess | reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) | instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação | instacron:RCAAP | 2022-09-27T20:41:51Z | oai:estudogeral.uc.pt:10316/102160 | Portal Agregador | ONG | https://www.rcaap.pt/oai/openaire | opendoar:7160 | 2024-03-19T21:19:12.877326 | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação | false
dc.title.none.fl_str_mv	Deep Learning for Automatic Segmentation and Classification of Adventitious Respiratory Sounds / Deep Learning para Segmentação e Classificação Automática de Sons Respiratórios Adventícios
title	Deep Learning for Automatic Segmentation and Classification of Adventitious Respiratory Sounds
spellingShingle	Deep Learning for Automatic Segmentation and Classification of Adventitious Respiratory Sounds Fernandes, Tiago Ferreira Aprendizagem Profunda Aprendizagem Computacional Sons Respiratórios Adventícios Classificação Segmentação Deep Learning Machine Learning Adventitious Respiratory Sounds Classification Segmentation
title_short	Deep Learning for Automatic Segmentation and Classification of Adventitious Respiratory Sounds
title_full	Deep Learning for Automatic Segmentation and Classification of Adventitious Respiratory Sounds
title_fullStr	Deep Learning for Automatic Segmentation and Classification of Adventitious Respiratory Sounds
title_full_unstemmed	Deep Learning for Automatic Segmentation and Classification of Adventitious Respiratory Sounds
title_sort	Deep Learning for Automatic Segmentation and Classification of Adventitious Respiratory Sounds
author	Fernandes, Tiago Ferreira
author_facet	Fernandes, Tiago Ferreira
author_role	author
dc.contributor.author.fl_str_mv	Fernandes, Tiago Ferreira
dc.subject.por.fl_str_mv	Aprendizagem Profunda Aprendizagem Computacional Sons Respiratórios Adventícios Classificação Segmentação Deep Learning Machine Learning Adventitious Respiratory Sounds Classification Segmentation
topic	Aprendizagem Profunda Aprendizagem Computacional Sons Respiratórios Adventícios Classificação Segmentação Deep Learning Machine Learning Adventitious Respiratory Sounds Classification Segmentation
description	Master's dissertation in Data Science and Engineering presented to the Faculty of Sciences and Technology
publishDate	2022
dc.date.none.fl_str_mv	2022-09-08
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/masterThesis
format	masterThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10316/102160 http://hdl.handle.net/10316/102160 TID:203062698
url	http://hdl.handle.net/10316/102160
identifier_str_mv	TID:203062698
dc.language.iso.fl_str_mv	eng
language	eng
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_	1799134086338445312
