Smart Data Driven System for Pathological Voices Classification

Bibliographic details
Main author: Fernandes, Joana
Publication date: 2022
Other authors: Junior, Arnaldo Candido [UNESP], Freitas, Diamantino, Teixeira, João Paulo
Document type: Conference paper
Language: eng
Source title: Repositório Institucional da UNESP
Full text: http://dx.doi.org/10.1007/978-3-031-23236-7_29
http://hdl.handle.net/11449/246824
Abstract: Classifying and recognizing voice pathologies non-invasively through acoustic analysis saves time for patients and specialists and can improve the accuracy of assessments. This work aims to identify which models best distinguish healthy from pathological voices, so that they can later be implemented in a system for detecting vocal pathologies. The dataset comprises 194 control subjects and 350 pathological subjects distributed across 17 pathologies. Each subject recorded 3 vowels in 3 tones, giving 9 sound files per subject. From each sound file, 13 parameters were extracted (jitta, jitter, Rap, PPQ5, ShdB, Shim, APQ3, APQ5, F0, HNR, autocorrelation, Shannon entropy and logarithmic entropy). Several families of classifiers, each with various models, were used to separate healthy from pathological voices: Decision Trees, Discriminant Analysis, Logistic Regression Classifiers, Naive Bayes Classifiers, Support Vector Machines, Nearest Neighbor Classifiers, Ensemble Classifiers and Neural Network Classifiers. Each subject was described by 118 parameters (13 acoustic parameters × 9 sound files per subject, plus the subject's gender). The input matrix was pre-processed by treating outliers with the quartile method, then normalizing the data and, finally, applying Principal Component Analysis (PCA) to reduce its dimensionality. The best model was the Wide Neural Network, with an accuracy of 98% and an AUC of 0.99.
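The feature count and the first two pre-processing steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: all function names are hypothetical, outliers are assumed to be clipped to the quartile fences (with the conventional factor 1.5), normalization is assumed to be a z-score, and gender is assumed to be encoded as 0/1 and excluded from outlier treatment. The PCA and classification stages are omitted.

```python
import random
import statistics

N_ACOUSTIC = 13                         # acoustic parameters per sound file (abstract)
N_FILES = 9                             # 3 vowels x 3 tones per subject (abstract)
N_FEATURES = N_ACOUSTIC * N_FILES + 1   # plus gender: 13 * 9 + 1 = 118


def build_feature_vector(per_file_params, gender):
    """Flatten 9 lists of 13 parameters and append gender (0/1 encoding is assumed)."""
    if len(per_file_params) != N_FILES:
        raise ValueError("expected 9 sound files per subject")
    vector = []
    for params in per_file_params:
        if len(params) != N_ACOUSTIC:
            raise ValueError("expected 13 parameters per sound file")
        vector.extend(params)
    vector.append(float(gender))
    return vector


def clip_outliers_iqr(column, k=1.5):
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR]; clipping and k=1.5 are assumptions."""
    q1, _, q3 = statistics.quantiles(column, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in column]


def zscore(column):
    """Scale a column to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    if std == 0:
        return [0.0] * len(column)      # constant column carries no information
    return [(v - mean) / std for v in column]


def preprocess(matrix):
    """Apply outlier clipping to the acoustic columns, then z-score every column."""
    columns = [list(col) for col in zip(*matrix)]
    last = len(columns) - 1             # gender column: skipped for clipping (assumption)
    processed = []
    for idx, col in enumerate(columns):
        if idx != last:
            col = clip_outliers_iqr(col)
        processed.append(zscore(col))
    return [list(row) for row in zip(*processed)]


if __name__ == "__main__":
    rng = random.Random(0)              # synthetic data, for illustration only
    subjects = []
    for _ in range(20):
        files = [[rng.gauss(0.0, 1.0) for _ in range(N_ACOUSTIC)]
                 for _ in range(N_FILES)]
        subjects.append(build_feature_vector(files, gender=rng.randint(0, 1)))
    X = preprocess(subjects)
    print(len(X), len(X[0]))            # rows (subjects), columns (features)
```

In a fuller pipeline, the matrix returned by `preprocess` would be the input to PCA and then to the classifiers listed in the abstract.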
id UNSP_b172a5af14320bb73d428250f811f62c
oai_identifier_str oai:repositorio.unesp.br:11449/246824
network_acronym_str UNSP
network_name_str Repositório Institucional da UNESP
repository_id_str 2946
spelling Research Centre in Digitalization and Intelligent Robotics (CeDRI), Instituto Politecnico de Braganca (IPB), Braganca 5300, Portugal; Faculdade de Engenharia da Universidade do Porto (FEUP)
São Paulo State University, Institute of Biosciences, Language and Physical Sciences, SP
Faculdade de Engenharia da Universidade do Porto (FEUP)
Research Centre in Digitalization and Intelligent Robotics (CeDRI); Applied Management Research Unit (UNIAG) and Laboratório para a Sustentabilidade e Tecnologia em Regiões de Montanha (SusTEC), Instituto Politecnico de Braganca (IPB)
dc.title.none.fl_str_mv Smart Data Driven System for Pathological Voices Classification
dc.contributor.none.fl_str_mv Faculdade de Engenharia da Universidade do Porto (FEUP)
Universidade Estadual Paulista (UNESP)
Instituto Politecnico de Braganca (IPB)
dc.contributor.author.fl_str_mv Fernandes, Joana
Junior, Arnaldo Candido [UNESP]
Freitas, Diamantino
Teixeira, João Paulo
dc.subject.por.fl_str_mv Machine learning
Principal component analysis
Speech features
Speech pathologies
Vocal acoustic analysis
publishDate 2022
dc.date.none.fl_str_mv 2022-01-01
2023-07-29T12:51:32Z
2023-07-29T12:51:32Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/conferenceObject
format conferenceObject
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://dx.doi.org/10.1007/978-3-031-23236-7_29
Communications in Computer and Information Science, v. 1754 CCIS, p. 419-426.
1865-0937
1865-0929
http://hdl.handle.net/11449/246824
10.1007/978-3-031-23236-7_29
2-s2.0-85148012253
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv Communications in Computer and Information Science
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv 419-426
dc.source.none.fl_str_mv Scopus
reponame:Repositório Institucional da UNESP
instname:Universidade Estadual Paulista (UNESP)
instacron:UNESP
repository.name.fl_str_mv Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP)
_version_ 1808129570929377280