Análise comparativa entre os métodos HMM e GMM-UBM na busca pelo α-ótimo dos locutores crianças utilizando a técnica VTLN

Bibliographic details
Main author: Martins, Ramon Mayor
Publication date: 2014
Document type: Master's thesis
Language: por
Source title: Biblioteca Digital de Teses e Dissertações da INATEL
Full text: http://tede.inatel.br:8080/tede/handle/tede/23
Abstract: This work seeks ways to reduce the high error rate of speech recognition systems that are trained on adult speakers and tested on child speakers. To that end, we propose the GMM-UBM method as an alternative to the HMM method for finding the optimal warping factor (α-optimal) for child speakers when speaker normalization is applied. The normalization technique adopted was VTLN, which normalizes the vocal tract of different child speakers by warping the frequencies of the mel filterbank. The evaluation also sought the number of mixtures that gives the best system performance. With the system trained on adults and tested on children, the error rate fell from 4.95% to 1.88% when VTLN used the α-optimal values found by HMM, and to 1.92% when it used the α-optimal values found by GMM-UBM. VTLN with α-optimal values found by GMM-UBM thus performed similarly to HMM in the experiments. The experiments indicate that GMM-UBM is the more suitable choice because it is simpler to implement and has a lower computational cost, making it an alternative to HMM for VTLN in speech recognition systems for child speakers.
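The search for the optimal warping factor described in the abstract can be illustrated with a minimal Python sketch. The record does not give the thesis's warping function, its grid of candidate factors, or its feature pipeline, so everything here is an assumption: a commonly used piecewise-linear warp (`warp_frequency`, with a hypothetical `cut_ratio` of 0.8), a diagonal-covariance GMM scorer (`gmm_avg_loglik`), and a plain grid search (`select_alpha`) that keeps the candidate with the highest score. The function and parameter names are illustrative only and are not taken from the thesis.

```python
import numpy as np


def warp_frequency(freq_hz, alpha, f_max, cut_ratio=0.8):
    """Piecewise-linear warp (assumed form): scale by alpha below a cutoff,
    then join linearly so that f_max still maps to f_max."""
    freq_hz = np.asarray(freq_hz, dtype=float)
    f_cut = cut_ratio * f_max
    if alpha * f_cut >= f_max:
        raise ValueError("alpha is too large for this cutoff")
    upper_slope = (f_max - alpha * f_cut) / (f_max - f_cut)
    return np.where(
        freq_hz <= f_cut,
        alpha * freq_hz,
        alpha * f_cut + upper_slope * (freq_hz - f_cut),
    )


def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D); weights: (M,); means, variances: (M, D).
    """
    frames = np.asarray(frames, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    diff = frames[:, None, :] - means[None, :, :]                    # (T, M, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (M,)
    log_comp = (
        np.log(weights) + log_norm - 0.5 * np.sum(diff**2 / variances, axis=2)
    )                                                                # (T, M)
    # Log-sum-exp over mixture components, stabilised by the per-frame peak.
    peak = log_comp.max(axis=1, keepdims=True)
    log_frame = peak[:, 0] + np.log(np.sum(np.exp(log_comp - peak), axis=1))
    return float(log_frame.mean())


def select_alpha(extract_features, score, alphas):
    """Grid search: return the candidate alpha whose warped features score highest.

    extract_features(alpha) is a caller-supplied (hypothetical) function that
    returns features computed with the warped filterbank; score(features)
    returns a number to maximise, e.g. a GMM log-likelihood.
    """
    scores = [score(extract_features(a)) for a in alphas]
    return alphas[int(np.argmax(scores))]
```

In use, `score` could wrap `gmm_avg_loglik` with the parameters of a model trained on adult speech, and `extract_features` could rebuild the mel filterbank with centre frequencies passed through `warp_frequency`. Whether this matches the procedure in the thesis cannot be confirmed from this record.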
id INAT_2656bb147a652cef66a26365aad0e28e
oai_identifier_str oai:localhost:tede/23
network_acronym_str INAT
network_name_str Biblioteca Digital de Teses e Dissertações da INATEL
repository_id_str
dc.title.por.fl_str_mv Análise comparativa entre os métodos HMM e GMM-UBM na busca pelo α-ótimo dos locutores crianças utilizando a técnica VTLN
title Análise comparativa entre os métodos HMM e GMM-UBM na busca pelo α-ótimo dos locutores crianças utilizando a técnica VTLN
spellingShingle Análise comparativa entre os métodos HMM e GMM-UBM na busca pelo α-ótimo dos locutores crianças utilizando a técnica VTLN
Martins, Ramon Mayor
Normalização de locutor; sistema de reconhecimento de fala; Modelos Ocultos de Markov; Modelos de Mistura Gaussiana; VTLN
Engenharia - Telecomunicações
title_short Análise comparativa entre os métodos HMM e GMM-UBM na busca pelo α-ótimo dos locutores crianças utilizando a técnica VTLN
title_full Análise comparativa entre os métodos HMM e GMM-UBM na busca pelo α-ótimo dos locutores crianças utilizando a técnica VTLN
title_fullStr Análise comparativa entre os métodos HMM e GMM-UBM na busca pelo α-ótimo dos locutores crianças utilizando a técnica VTLN
title_full_unstemmed Análise comparativa entre os métodos HMM e GMM-UBM na busca pelo α-ótimo dos locutores crianças utilizando a técnica VTLN
title_sort Análise comparativa entre os métodos HMM e GMM-UBM na busca pelo α-ótimo dos locutores crianças utilizando a técnica VTLN
author Martins, Ramon Mayor
author_facet Martins, Ramon Mayor
author_role author
dc.contributor.advisor1.fl_str_mv Ynoguti, Carlos Alberto
dc.contributor.advisor1ID.fl_str_mv 156.167.778-70
dc.contributor.advisor1Lattes.fl_str_mv http://lattes.cnpq.br/5678667205895840
dc.contributor.referee1.fl_str_mv Ynoguti, Carlos Alberto
dc.contributor.referee1ID.fl_str_mv 156.167.778-70
dc.contributor.referee1Lattes.fl_str_mv http://lattes.cnpq.br/5678667205895840
dc.contributor.referee2.fl_str_mv Guimarães, Dayan Adionel
dc.contributor.referee2ID.fl_str_mv 739.337.836-15
dc.contributor.referee2Lattes.fl_str_mv http://lattes.cnpq.br/2503439503631682
dc.contributor.referee3.fl_str_mv Minami, Mário
dc.contributor.referee3Lattes.fl_str_mv http://lattes.cnpq.br/5882877274227409
dc.contributor.authorID.fl_str_mv 052.866.756-46
dc.contributor.authorLattes.fl_str_mv http://lattes.cnpq.br/6289204315531991
dc.contributor.author.fl_str_mv Martins, Ramon Mayor
contributor_str_mv Ynoguti, Carlos Alberto
Ynoguti, Carlos Alberto
Guimarães, Dayan Adionel
Minami, Mário
dc.subject.por.fl_str_mv Normalização de locutor; sistema de reconhecimento de fala; Modelos Ocultos de Markov; Modelos de Mistura Gaussiana; VTLN
topic Normalização de locutor; sistema de reconhecimento de fala; Modelos Ocultos de Markov; Modelos de Mistura Gaussiana; VTLN
Engenharia - Telecomunicações
dc.subject.cnpq.fl_str_mv Engenharia - Telecomunicações
description This work seeks ways to reduce the high error rate of speech recognition systems that are trained on adult speakers and tested on child speakers. To that end, we propose the GMM-UBM method as an alternative to the HMM method for finding the optimal warping factor (α-optimal) for child speakers when speaker normalization is applied. The normalization technique adopted was VTLN, which normalizes the vocal tract of different child speakers by warping the frequencies of the mel filterbank. The evaluation also sought the number of mixtures that gives the best system performance. With the system trained on adults and tested on children, the error rate fell from 4.95% to 1.88% when VTLN used the α-optimal values found by HMM, and to 1.92% when it used the α-optimal values found by GMM-UBM. VTLN with α-optimal values found by GMM-UBM thus performed similarly to HMM in the experiments. The experiments indicate that GMM-UBM is the more suitable choice because it is simpler to implement and has a lower computational cost, making it an alternative to HMM for VTLN in speech recognition systems for child speakers.
publishDate 2014
dc.date.issued.fl_str_mv 2014-10-09
dc.date.accessioned.fl_str_mv 2016-06-27T18:30:31Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.citation.fl_str_mv Martins, Ramon Mayor. Análise comparativa entre os métodos HMM e GMM-UBM na busca pelo α-ótimo dos locutores crianças utilizando a técnica VTLN. 2014. [60]. Dissertação (Programa 1) - Instituto Nacional de Telecomunicações, [Santa Rita do Sapucaí].
dc.identifier.uri.fl_str_mv http://tede.inatel.br:8080/tede/handle/tede/23
identifier_str_mv Martins, Ramon Mayor. Análise comparativa entre os métodos HMM e GMM-UBM na busca pelo α-ótimo dos locutores crianças utilizando a técnica VTLN. 2014. [60]. Dissertação (Programa 1) - Instituto Nacional de Telecomunicações, [Santa Rita do Sapucaí].
url http://tede.inatel.br:8080/tede/handle/tede/23
dc.language.iso.fl_str_mv por
language por
dc.rights.driver.fl_str_mv http://creativecommons.org/licenses/by-nd/4.0/
info:eu-repo/semantics/openAccess
rights_invalid_str_mv http://creativecommons.org/licenses/by-nd/4.0/
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Instituto Nacional de Telecomunicações
dc.publisher.program.fl_str_mv Mestrado em Engenharia de Telecomunicações
dc.publisher.initials.fl_str_mv INATEL
dc.publisher.country.fl_str_mv Brasil
dc.publisher.department.fl_str_mv Instituto Nacional de Telecomunicações
publisher.none.fl_str_mv Instituto Nacional de Telecomunicações
dc.source.none.fl_str_mv reponame:Biblioteca Digital de Teses e Dissertações da INATEL
instname:Instituto Nacional de Telecomunicações (INATEL)
instacron:INATEL
instname_str Instituto Nacional de Telecomunicações (INATEL)
instacron_str INATEL
institution INATEL
reponame_str Biblioteca Digital de Teses e Dissertações da INATEL
collection Biblioteca Digital de Teses e Dissertações da INATEL
bitstream.url.fl_str_mv http://localhost:8080/tede/bitstream/tede/23/1/license.txt
http://localhost:8080/tede/bitstream/tede/23/2/license_url
http://localhost:8080/tede/bitstream/tede/23/3/license_text
http://localhost:8080/tede/bitstream/tede/23/4/license_rdf
http://localhost:8080/tede/bitstream/tede/23/5/Dissertac%CC%A7a%CC%83o+V.Final+Ramon+Mayor+Martins.pdf
http://localhost:8080/tede/bitstream/tede/23/6/Dissertac%CC%A7a%CC%83o+V.Final+Ramon+Mayor+Martins.pdf.txt
http://localhost:8080/tede/bitstream/tede/23/7/Dissertac%CC%A7a%CC%83o+V.Final+Ramon+Mayor+Martins.pdf.jpg
bitstream.checksum.fl_str_mv c6279291b293f0db82678eaa73a27769
587cd8ffae15c8598ed3c46d248a3f38
d41d8cd98f00b204e9800998ecf8427e
d41d8cd98f00b204e9800998ecf8427e
e21cd6acb902d52fc69d00903b5b1b33
a2cbce108e33bf2eb51ad1eb1a9641f4
6ec32d36c3ca40c02ec2ffd33bff8575
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
MD5
MD5
MD5
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da INATEL - Instituto Nacional de Telecomunicações (INATEL)
repository.mail.fl_str_mv biblioteca@inatel.br || biblioteca.atendimento@inatel.br
_version_ 1800214190323924992