Using Dysphonic Voice to Characterize Speaker’s Biometry

Bibliographic details
Main author: Gómez, Pedro
Publication date: 2017
Other authors: San Segundo, Eugenia, Mazaira, Luis M., Álvarez, Augustin, Rodellar, Victoria
Document type: Article
Language: por
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: https://ojs.letras.up.pt/index.php/LLLD/article/view/2431
Abstract: Phonation distortion leaves relevant marks in a speaker’s biometric profile. Dysphonic voice production may be used for biometric speaker characterization. In the present paper, phonation features derived from the glottal source (GS) parameterization, after vocal tract inversion, are proposed for dysphonic voice characterization in Speaker Verification tasks. The GS-derived parameters are matched in a forensic evaluation framework that defines a distance-based metric specification. The phonation segments used in the study are derived from fillers, long vowels, and other phonation segments produced in spontaneous telephone conversations. Phonated segments from a telephone database of 100 male native Spanish speakers are combined in a 10-fold cross-validation task to produce the set of quality measurements outlined in the paper. Shimmer, mucosal wave correlate, vocal fold cover biomechanical parameter unbalance and a subset of the GS cepstral profile produce accuracy rates as high as 99.57% for a wide threshold interval (62.08-75.04%). An Equal Error Rate of 0.64% can be attained. The proposed metric framework is shown to behave more fairly than classical likelihood ratios in supporting the hypothesis of the defense vs. that of the prosecution, thus offering more reliable evaluation scoring. Possible applications are Speaker Verification and Dysphonic Voice Grading.
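For readers unfamiliar with the Equal Error Rate quoted in the abstract, the following is a minimal sketch of how an EER can be estimated from distance scores. It is not the authors' procedure: the function name `equal_error_rate`, the toy score lists, and the simple threshold sweep are all assumptions made for illustration only.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER from distance scores (lower distance = more similar).

    genuine:  distances from same-speaker trials
    impostor: distances from different-speaker trials
    Returns (eer, threshold) at the swept threshold where the false
    acceptance rate (FAR) and false rejection rate (FRR) are closest.
    """
    best = None
    # Sweep every observed score as a candidate decision threshold.
    for t in sorted(set(genuine) | set(impostor)):
        # A trial is accepted when its distance is <= threshold.
        frr = sum(d > t for d in genuine) / len(genuine)
        far = sum(d <= t for d in impostor) / len(impostor)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical toy scores, not data from the paper.
genuine = [0.10, 0.20, 0.25, 0.30, 0.55]
impostor = [0.40, 0.60, 0.70, 0.80, 0.90]
eer, thr = equal_error_rate(genuine, impostor)
print(f"EER = {eer:.2%} at threshold {thr}")
```

With real data the sweep would typically run over many more trials, and the EER would be reported per cross-validation fold or averaged across folds.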
id RCAP_674132a73b329c8d28c6501d78c7f4a1
oai_identifier_str oai:ojs.pkp.sfu.ca:article/2431
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
spelling Using Dysphonic Voice to Characterize Speaker’s Biometry; Artigos/Articles; Faculdade de Letras da Universidade do Porto; 2017-05-30T00:00:00Z; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/article; https://ojs.letras.up.pt/index.php/LLLD/article/view/2431; por; 2183-3745; Gómez, Pedro; San Segundo, Eugenia; Mazaira, Luis M.; Álvarez, Augustin; Rodellar, Victoria; info:eu-repo/semantics/openAccess; reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP; 2022-09-21T15:48:17Z; oai:ojs.pkp.sfu.ca:article/2431; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; opendoar:7160; 2024-03-19T15:56:36.285135; Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; false
dc.title.none.fl_str_mv Using Dysphonic Voice to Characterize Speaker’s Biometry
title Using Dysphonic Voice to Characterize Speaker’s Biometry
spellingShingle Using Dysphonic Voice to Characterize Speaker’s Biometry
Gómez, Pedro
Artigos/Articles
title_short Using Dysphonic Voice to Characterize Speaker’s Biometry
title_full Using Dysphonic Voice to Characterize Speaker’s Biometry
title_fullStr Using Dysphonic Voice to Characterize Speaker’s Biometry
title_full_unstemmed Using Dysphonic Voice to Characterize Speaker’s Biometry
title_sort Using Dysphonic Voice to Characterize Speaker’s Biometry
author Gómez, Pedro
author_facet Gómez, Pedro
San Segundo, Eugenia
Mazaira, Luis M.
Álvarez, Augustin
Rodellar, Victoria
author_role author
author2 San Segundo, Eugenia
Mazaira, Luis M.
Álvarez, Augustin
Rodellar, Victoria
author2_role author
author
author
author
dc.contributor.author.fl_str_mv Gómez, Pedro
San Segundo, Eugenia
Mazaira, Luis M.
Álvarez, Augustin
Rodellar, Victoria
dc.subject.por.fl_str_mv Artigos/Articles
topic Artigos/Articles
description Phonation distortion leaves relevant marks in a speaker’s biometric profile. Dysphonic voice production may be used for biometric speaker characterization. In the present paper, phonation features derived from the glottal source (GS) parameterization, after vocal tract inversion, are proposed for dysphonic voice characterization in Speaker Verification tasks. The GS-derived parameters are matched in a forensic evaluation framework that defines a distance-based metric specification. The phonation segments used in the study are derived from fillers, long vowels, and other phonation segments produced in spontaneous telephone conversations. Phonated segments from a telephone database of 100 male native Spanish speakers are combined in a 10-fold cross-validation task to produce the set of quality measurements outlined in the paper. Shimmer, mucosal wave correlate, vocal fold cover biomechanical parameter unbalance and a subset of the GS cepstral profile produce accuracy rates as high as 99.57% for a wide threshold interval (62.08-75.04%). An Equal Error Rate of 0.64% can be attained. The proposed metric framework is shown to behave more fairly than classical likelihood ratios in supporting the hypothesis of the defense vs. that of the prosecution, thus offering more reliable evaluation scoring. Possible applications are Speaker Verification and Dysphonic Voice Grading.
publishDate 2017
dc.date.none.fl_str_mv 2017-05-30T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://ojs.letras.up.pt/index.php/LLLD/article/view/2431
url https://ojs.letras.up.pt/index.php/LLLD/article/view/2431
dc.language.iso.fl_str_mv por
language por
dc.relation.none.fl_str_mv 2183-3745
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Faculdade de Letras da Universidade do Porto
publisher.none.fl_str_mv Faculdade de Letras da Universidade do Porto
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799130434145091584