Effective speaker retrieval and recognition through vector quantization and unsupervised distance learning
Main author: | De Abreu Campos, Victor [UNESP] |
---|---|
Publication date: | 2016 |
Other authors: | Guimarães Pedronette, Daniel Carlos [UNESP] |
Document type: | Conference paper |
Language: | eng |
Source title: | Repositório Institucional da UNESP |
Full text: | http://dx.doi.org/10.1145/2927006.2927010 http://hdl.handle.net/11449/168820 |
Abstract: | The huge amount of multimedia content accumulated daily has demanded the development of effective retrieval approaches. In this context, speaker recognition methods capable of automatically identifying a person through their voice are of great relevance. This paper presents a novel speaker recognition approach modelled in a retrieval scenario and using a recent unsupervised learning method. The proposed approach considers MFCC features and a Vector Quantization model to compute distances among audio objects. Next, a rank-based unsupervised learning method is used to improve the effectiveness of retrieval results. Several experiments were conducted on three public datasets with different settings, such as background noise from diverse sources. Experimental results demonstrate that the proposed approach can achieve very high effectiveness. In addition, effectiveness gains of up to 27% were obtained by the unsupervised learning procedure. |
id |
UNSP_6df09fce8c2f4f6c887e72353a07b6c1 |
---|---|
oai_identifier_str |
oai:repositorio.unesp.br:11449/168820 |
network_acronym_str |
UNSP |
network_name_str |
Repositório Institucional da UNESP |
repository_id_str |
2946 |
spelling |
Effective speaker retrieval and recognition through vector quantization and unsupervised distance learning; Speaker recognition; Unsupervised learning; Vector quantization; The huge amount of multimedia content accumulated daily has demanded the development of effective retrieval approaches. In this context, speaker recognition methods capable of automatically identifying a person through their voice are of great relevance. This paper presents a novel speaker recognition approach modelled in a retrieval scenario and using a recent unsupervised learning method. The proposed approach considers MFCC features and a Vector Quantization model to compute distances among audio objects. Next, a rank-based unsupervised learning method is used to improve the effectiveness of retrieval results. Several experiments were conducted on three public datasets with different settings, such as background noise from diverse sources. Experimental results demonstrate that the proposed approach can achieve very high effectiveness. In addition, effectiveness gains of up to 27% were obtained by the unsupervised learning procedure.; Dept. of Statistic Applied Math. and Computing Universidade Estadual Paulista (UNESP); Dept. of Statistic Applied Math. and Computing Universidade Estadual Paulista (UNESP); Universidade Estadual Paulista (Unesp); De Abreu Campos, Victor [UNESP]; Guimarães Pedronette, Daniel Carlos [UNESP]; 2018-12-11T16:43:13Z; 2018-12-11T16:43:13Z; 2016-06-06; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/conferenceObject; 27-32; http://dx.doi.org/10.1145/2927006.2927010; MARMI 2016 - Proceedings of the 2016 ACM 1st International Workshop on Multimedia Analysis and Retrieval for Multimodal Interaction, co-located with ICMR 2016, p. 27-32.; http://hdl.handle.net/11449/168820; 10.1145/2927006.2927010; 2-s2.0-84978747065; Scopus; reponame:Repositório Institucional da UNESP; instname:Universidade Estadual Paulista (UNESP); instacron:UNESP; eng; MARMI 2016 - Proceedings of the 2016 ACM 1st International Workshop on Multimedia Analysis and Retrieval for Multimodal Interaction, co-located with ICMR 2016; info:eu-repo/semantics/openAccess; 2021-10-23T21:47:04Z; oai:repositorio.unesp.br:11449/168820; Repositório Institucional; PUB; http://repositorio.unesp.br/oai/request; opendoar:2946; 2024-08-05T13:44:02.677605; Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP); false |
dc.title.none.fl_str_mv |
Effective speaker retrieval and recognition through vector quantization and unsupervised distance learning |
title |
Effective speaker retrieval and recognition through vector quantization and unsupervised distance learning |
spellingShingle |
Effective speaker retrieval and recognition through vector quantization and unsupervised distance learning; De Abreu Campos, Victor [UNESP]; Speaker recognition; Unsupervised learning; Vector quantization |
author |
De Abreu Campos, Victor [UNESP] |
author_facet |
De Abreu Campos, Victor [UNESP]; Guimarães Pedronette, Daniel Carlos [UNESP] |
author_role |
author |
author2 |
Guimarães Pedronette, Daniel Carlos [UNESP] |
author2_role |
author |
dc.contributor.none.fl_str_mv |
Universidade Estadual Paulista (Unesp) |
dc.contributor.author.fl_str_mv |
De Abreu Campos, Victor [UNESP]; Guimarães Pedronette, Daniel Carlos [UNESP] |
dc.subject.por.fl_str_mv |
Speaker recognition; Unsupervised learning; Vector quantization |
topic |
Speaker recognition; Unsupervised learning; Vector quantization |
description |
The huge amount of multimedia content accumulated daily has demanded the development of effective retrieval approaches. In this context, speaker recognition methods capable of automatically identifying a person through their voice are of great relevance. This paper presents a novel speaker recognition approach modelled in a retrieval scenario and using a recent unsupervised learning method. The proposed approach considers MFCC features and a Vector Quantization model to compute distances among audio objects. Next, a rank-based unsupervised learning method is used to improve the effectiveness of retrieval results. Several experiments were conducted on three public datasets with different settings, such as background noise from diverse sources. Experimental results demonstrate that the proposed approach can achieve very high effectiveness. In addition, effectiveness gains of up to 27% were obtained by the unsupervised learning procedure. |
publishDate |
2016 |
dc.date.none.fl_str_mv |
2016-06-06 2018-12-11T16:43:13Z 2018-12-11T16:43:13Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/conferenceObject |
format |
conferenceObject |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://dx.doi.org/10.1145/2927006.2927010; MARMI 2016 - Proceedings of the 2016 ACM 1st International Workshop on Multimedia Analysis and Retrieval for Multimodal Interaction, co-located with ICMR 2016, p. 27-32.; http://hdl.handle.net/11449/168820; 10.1145/2927006.2927010; 2-s2.0-84978747065 |
url |
http://dx.doi.org/10.1145/2927006.2927010 http://hdl.handle.net/11449/168820 |
identifier_str_mv |
MARMI 2016 - Proceedings of the 2016 ACM 1st International Workshop on Multimedia Analysis and Retrieval for Multimodal Interaction, co-located with ICMR 2016, p. 27-32.; 10.1145/2927006.2927010; 2-s2.0-84978747065 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
MARMI 2016 - Proceedings of the 2016 ACM 1st International Workshop on Multimedia Analysis and Retrieval for Multimodal Interaction, co-located with ICMR 2016 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
27-32 |
dc.source.none.fl_str_mv |
Scopus; reponame:Repositório Institucional da UNESP; instname:Universidade Estadual Paulista (UNESP); instacron:UNESP |
instname_str |
Universidade Estadual Paulista (UNESP) |
instacron_str |
UNESP |
institution |
UNESP |
reponame_str |
Repositório Institucional da UNESP |
collection |
Repositório Institucional da UNESP |
repository.name.fl_str_mv |
Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP) |
repository.mail.fl_str_mv |
|
_version_ |
1808128269992591360 |
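
The abstract states that the approach uses MFCC features and a Vector Quantization model to compute distances among audio objects. The record gives no further detail, so the following is only a minimal sketch of one common way such a distance can be built, not the authors' implementation. It assumes each audio object is already a list of feature frames (stand-ins for MFCC vectors, generated synthetically here), trains a small k-means codebook per object, and uses the symmetrized average quantization distortion as the distance. All function names, the codebook size `k`, and the synthetic data are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_codebook(frames, k, iters=20, seed=0):
    """Lloyd's k-means over feature frames; returns k centroids (the codebook)."""
    rng = random.Random(seed)
    codebook = [list(f) for f in rng.sample(frames, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for f in frames:
            nearest = min(range(k), key=lambda j: sq_dist(f, codebook[j]))
            clusters[nearest].append(f)
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                codebook[j] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def distortion(frames, codebook):
    """Average squared distance from each frame to its nearest codeword."""
    return sum(min(sq_dist(f, c) for c in codebook) for f in frames) / len(frames)


def vq_distance(frames_a, frames_b, k=4):
    """Symmetrized VQ distortion between two audio objects (assumed form)."""
    cb_a = train_codebook(frames_a, k)
    cb_b = train_codebook(frames_b, k)
    return 0.5 * (distortion(frames_a, cb_b) + distortion(frames_b, cb_a))


def fake_frames(center, n=60, dim=13, seed=0):
    """Synthetic stand-in for MFCC frames: Gaussian noise around `center`."""
    rng = random.Random(seed)
    return [[center + rng.gauss(0, 1) for _ in range(dim)] for _ in range(n)]


# Two recordings from one hypothetical speaker and one from another.
spk1_a = fake_frames(0.0, seed=1)
spk1_b = fake_frames(0.0, seed=2)
spk2 = fake_frames(5.0, seed=3)

same = vq_distance(spk1_a, spk1_b)
diff = vq_distance(spk1_a, spk2)
print("same speaker:", same, "different speaker:", diff)
```

In a retrieval setting, such distances would be computed between a query and every object in the collection and sorted to produce a ranked list. The rank-based unsupervised learning step mentioned in the abstract would then operate on those ranked lists, and it is not sketched here because the record does not describe it.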