Smart Data Driven System for Pathological Voices Classification
Main author: | Fernandes, Joana |
---|---|
Publication date: | 2022 |
Other authors: | Junior, Arnaldo Candido [UNESP]; Freitas, Diamantino; Teixeira, João Paulo |
Document type: | Conference paper |
Language: | eng |
Source title: | Repositório Institucional da UNESP |
Full text: | http://dx.doi.org/10.1007/978-3-031-23236-7_29 http://hdl.handle.net/11449/246824 |
Abstract: | Classifying and recognizing voice pathologies non-invasively through acoustic analysis saves patient and specialist time and can improve the accuracy of assessments. This work investigates which models most accurately distinguish healthy from pathological voices, with a view to later implementing them in a system for detecting vocal pathologies. The dataset comprises 194 control subjects and 350 pathological subjects distributed across 17 pathologies. Each subject contributed 3 vowels in 3 tones, giving 9 sound files per subject. From each sound file, 13 parameters were extracted (jitta, jitter, Rap, PPQ5, ShdB, Shim, APQ3, APQ5, F0, HNR, autocorrelation, Shannon entropy and logarithmic entropy). To classify subjects as healthy or pathological, several classifier families were used, each with various models (Decision Trees, Discriminant Analysis, Logistic Regression Classifiers, Naive Bayes Classifiers, Support Vector Machines, Nearest Neighbor Classifiers, Ensemble Classifiers, Neural Network Classifiers). Each subject is described by 118 parameters (13 acoustic parameters × 9 sound files, plus the subject’s gender). The input matrix was pre-processed in three steps: outliers were treated using the quartile method, the data were normalized, and Principal Component Analysis (PCA) was applied to reduce the dimensionality. The best model was the Wide Neural Network, with an accuracy of 98% and an AUC of 0.99. |
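The feature layout and the first two pre-processing steps named in the abstract can be sketched as follows. This is a minimal illustration in Python, not the authors' implementation: the vowel and tone labels, the function names, the use of 1.5 × IQR fences with clipping as the outlier treatment, and z-score normalization are all assumptions, since the abstract only says "the quartile method" and "normalized". PCA and the classifiers are left out.

```python
import statistics

# Parameter names as listed in the abstract (identifiers adapted for code).
PARAMS = ["jitta", "jitter", "Rap", "PPQ5", "ShdB", "Shim", "APQ3", "APQ5",
          "F0", "HNR", "autocorrelation", "shannon_entropy", "log_entropy"]
VOWELS = ["a", "i", "u"]            # hypothetical labels; the abstract only says "3 vowels"
TONES = ["low", "normal", "high"]   # hypothetical labels; the abstract only says "3 tones"


def feature_names():
    """One name per input column: 13 parameters x 9 recordings, plus gender."""
    names = [f"{p}_{v}_{t}" for v in VOWELS for t in TONES for p in PARAMS]
    return names + ["gender"]


def clip_outliers(column, k=1.5):
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed outlier treatment)."""
    q1, _, q3 = statistics.quantiles(column, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(x, low), high) for x in column]


def zscore(column):
    """Standardize to zero mean and unit variance (assumed normalization)."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    if std == 0:
        return [0.0 for _ in column]  # constant column carries no information
    return [(x - mean) / std for x in column]


def preprocess(matrix):
    """Clip outliers, then z-score, column by column, on a row-major matrix."""
    columns = [zscore(clip_outliers(list(col))) for col in zip(*matrix)]
    return [list(row) for row in zip(*columns)]
```

Called on a subjects-by-118 matrix, `preprocess` returns a matrix of the same shape, which could then be passed to a PCA step.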
id |
UNSP_b172a5af14320bb73d428250f811f62c |
---|---|
oai_identifier_str |
oai:repositorio.unesp.br:11449/246824 |
network_acronym_str |
UNSP |
network_name_str |
Repositório Institucional da UNESP |
repository_id_str |
2946 |
spelling |
Smart Data Driven System for Pathological Voices Classification; Machine learning; Principal component analysis; Speech features; Speech pathologies; Vocal acoustic analysis; Research Centre in Digitalization and Intelligent Robotics (CeDRI), Instituto Politecnico de Braganca (IPB), Braganca 5300, Portugal; Faculdade de Engenharia da Universidade do Porto (FEUP); São Paulo State University, Institute of Biosciences, Language and Physical Sciences, SP; Research Centre in Digitalization and Intelligent Robotics (CeDRI), Applied Management Research Unit (UNIAG) and Laboratório para a Sustentabilidade e Tecnologia em Regiões de Montanha (SusTEC), Instituto Politecnico de Braganca (IPB); Universidade Estadual Paulista (UNESP); Fernandes, Joana; Junior, Arnaldo Candido [UNESP]; Freitas, Diamantino; Teixeira, João Paulo; 2023-07-29T12:51:32Z; 2022-01-01; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/conferenceObject; 419-426; http://dx.doi.org/10.1007/978-3-031-23236-7_29; Communications in Computer and Information Science, v. 1754 CCIS, p. 419-426; 1865-0937; 1865-0929; http://hdl.handle.net/11449/246824; 10.1007/978-3-031-23236-7_29; 2-s2.0-85148012253; Scopus; reponame:Repositório Institucional da UNESP; instname:Universidade Estadual Paulista (UNESP); instacron:UNESP; eng; info:eu-repo/semantics/openAccess; oai:repositorio.unesp.br:11449/246824; Repositório Institucional; PUB; http://repositorio.unesp.br/oai/request; opendoar:2946; 2024-08-06T00:00:02.911101; Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP); false |
dc.title.none.fl_str_mv |
Smart Data Driven System for Pathological Voices Classification |
author |
Fernandes, Joana |
dc.contributor.none.fl_str_mv |
Faculdade de Engenharia da Universidade do Porto (FEUP); Universidade Estadual Paulista (UNESP); Instituto Politecnico de Braganca (IPB) |
dc.contributor.author.fl_str_mv |
Fernandes, Joana; Junior, Arnaldo Candido [UNESP]; Freitas, Diamantino; Teixeira, João Paulo |
dc.subject.por.fl_str_mv |
Machine learning; Principal component analysis; Speech features; Speech pathologies; Vocal acoustic analysis |
publishDate |
2022 |
dc.date.none.fl_str_mv |
2022-01-01; 2023-07-29T12:51:32Z; 2023-07-29T12:51:32Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/conferenceObject |
dc.identifier.uri.fl_str_mv |
http://dx.doi.org/10.1007/978-3-031-23236-7_29; Communications in Computer and Information Science, v. 1754 CCIS, p. 419-426; 1865-0937; 1865-0929; http://hdl.handle.net/11449/246824; 10.1007/978-3-031-23236-7_29; 2-s2.0-85148012253 |
dc.language.iso.fl_str_mv |
eng |
dc.relation.none.fl_str_mv |
Communications in Computer and Information Science |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
dc.format.none.fl_str_mv |
419-426 |
dc.source.none.fl_str_mv |
Scopus; reponame:Repositório Institucional da UNESP; instname:Universidade Estadual Paulista (UNESP); instacron:UNESP |
repository.name.fl_str_mv |
Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP) |
_version_ |
1808129570929377280 |