A convolutional neural network approach for speech quality assesment
Main author: | ALBUQUERQUE, Renato Quirino de |
---|---|
Publication date: | 2020 |
Document type: | Dissertation |
Language: | por |
Source title: | Repositório Institucional da UFPE |
Full text: | https://repositorio.ufpe.br/handle/123456789/38524 |
Abstract: | An important aspect of speech understanding is quality, which can be defined as the fidelity of a signal to its original (or idealized) version when a comparison is possible. Although quality is subjective, there are approaches to measuring it. The most effective approach is subjective testing, in which individuals rate the quality of speech samples by assigning them quality scores. Automatic measurement methods, however, cost less and respond faster. They can be divided into methods that use only the sample under evaluation (non-reference) and methods that use both the degraded and the reference versions of the speech sample (full-reference). For many current applications the original speech sample cannot be obtained, which calls for the development and application of non-reference techniques. This dissertation therefore presents a convolutional neural network model for speech quality assessment (CNN-SQA). It is a non-reference method that uses fully convolutional layers as feature extractors for speech representation and fully connected layers to perform the quality assessment step. Several visual features were evaluated as the model input, and MFCC coefficients gave the best results. The remaining model parameters were chosen through an iterative, incremental parameter selection process. The model's performance was evaluated against the PESQ, ViSQOL and P.563 methods. Further experiments analyze the model's behavior on speech and noise in isolation. The experiments were carried out on publicly available databases and on a new database built to evaluate the method in the presence of background noise. Finally, the results were analyzed using correlation measures and descriptive statistics. |
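The abstract says results were analyzed with correlation measures but does not name them. As a minimal sketch, assuming the Pearson and Spearman coefficients that are commonly used to compare predicted quality scores with subjective ones, the computation could look like the following. The function names and the sample scores are hypothetical and are not taken from the dissertation.

```python
from math import sqrt
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson linear correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Made-up illustrative scores, NOT results from the dissertation.
    subjective = [1.2, 2.5, 3.1, 3.8, 4.6]  # e.g. listener ratings
    predicted = [1.5, 2.2, 3.4, 3.6, 4.4]   # e.g. model outputs
    print("Pearson:", pearson(subjective, predicted))
    print("Spearman:", spearman(subjective, predicted))
```

Pearson measures how linearly the predictions follow the subjective scores, while Spearman only checks whether the two orderings agree, so reporting both separates calibration errors from ranking errors.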
id |
UFPE_2c015cf28f9fb2d4eb0c66c2f59ef5e7 |
---|---|
oai_identifier_str |
oai:repositorio.ufpe.br:123456789/38524 |
network_acronym_str |
UFPE |
network_name_str |
Repositório Institucional da UFPE |
repository_id_str |
2221 |
dc.title.pt_BR.fl_str_mv |
A convolutional neural network approach for speech quality assesment |
title |
A convolutional neural network approach for speech quality assesment |
spellingShingle |
A convolutional neural network approach for speech quality assesment ALBUQUERQUE, Renato Quirino de Ciência da computação Redes neurais convolucionais |
title_short |
A convolutional neural network approach for speech quality assesment |
title_full |
A convolutional neural network approach for speech quality assesment |
title_fullStr |
A convolutional neural network approach for speech quality assesment |
title_full_unstemmed |
A convolutional neural network approach for speech quality assesment |
title_sort |
A convolutional neural network approach for speech quality assesment |
author |
ALBUQUERQUE, Renato Quirino de |
author_facet |
ALBUQUERQUE, Renato Quirino de |
author_role |
author |
dc.contributor.authorLattes.pt_BR.fl_str_mv |
http://lattes.cnpq.br/8473935466226177 |
dc.contributor.advisorLattes.pt_BR.fl_str_mv |
http://lattes.cnpq.br/2248591013863307 |
dc.contributor.author.fl_str_mv |
ALBUQUERQUE, Renato Quirino de |
dc.contributor.advisor1.fl_str_mv |
MELLO, Carlos Alexandre Barros de |
contributor_str_mv |
MELLO, Carlos Alexandre Barros de |
dc.subject.por.fl_str_mv |
Computer science Convolutional neural networks |
topic |
Computer science Convolutional neural networks |
publishDate |
2020 |
dc.date.accessioned.fl_str_mv |
2020-11-09T12:00:01Z |
dc.date.available.fl_str_mv |
2020-11-09T12:00:01Z |
dc.date.issued.fl_str_mv |
2020-02-20 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.citation.fl_str_mv |
ALBUQUERQUE, Renato Quirino de. A convolutional neural network approach for speech quality assesment. 2020. Dissertação (Mestrado em Ciência da Computação) - Universidade Federal de Pernambuco, Recife, 2020. |
dc.identifier.uri.fl_str_mv |
https://repositorio.ufpe.br/handle/123456789/38524 |
identifier_str_mv |
ALBUQUERQUE, Renato Quirino de. A convolutional neural network approach for speech quality assesment. 2020. Dissertação (Mestrado em Ciência da Computação) - Universidade Federal de Pernambuco, Recife, 2020. |
url |
https://repositorio.ufpe.br/handle/123456789/38524 |
dc.language.iso.fl_str_mv |
por |
language |
por |
dc.rights.driver.fl_str_mv |
Attribution-NonCommercial-NoDerivs 3.0 Brazil http://creativecommons.org/licenses/by-nc-nd/3.0/br/ info:eu-repo/semantics/openAccess |
rights_invalid_str_mv |
Attribution-NonCommercial-NoDerivs 3.0 Brazil http://creativecommons.org/licenses/by-nc-nd/3.0/br/ |
eu_rights_str_mv |
openAccess |
dc.publisher.none.fl_str_mv |
Universidade Federal de Pernambuco |
dc.publisher.program.fl_str_mv |
Programa de Pos Graduacao em Ciencia da Computacao |
dc.publisher.initials.fl_str_mv |
UFPE |
dc.publisher.country.fl_str_mv |
Brasil |
publisher.none.fl_str_mv |
Universidade Federal de Pernambuco |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFPE instname:Universidade Federal de Pernambuco (UFPE) instacron:UFPE |
instname_str |
Universidade Federal de Pernambuco (UFPE) |
instacron_str |
UFPE |
institution |
UFPE |
reponame_str |
Repositório Institucional da UFPE |
collection |
Repositório Institucional da UFPE |
bitstream.url.fl_str_mv |
https://repositorio.ufpe.br/bitstream/123456789/38524/1/DISSERTA%c3%87%c3%83O%20Renato%20Quirino%20de%20Albuquerque.pdf https://repositorio.ufpe.br/bitstream/123456789/38524/3/license.txt https://repositorio.ufpe.br/bitstream/123456789/38524/2/license_rdf https://repositorio.ufpe.br/bitstream/123456789/38524/4/DISSERTA%c3%87%c3%83O%20Renato%20Quirino%20de%20Albuquerque.pdf.txt https://repositorio.ufpe.br/bitstream/123456789/38524/5/DISSERTA%c3%87%c3%83O%20Renato%20Quirino%20de%20Albuquerque.pdf.jpg |
bitstream.checksum.fl_str_mv |
42ea4cb62f4470cb98cce2ad8b848353 bd573a5ca8288eb7272482765f819534 e39d27027a6cc9cb039ad269a5db8e34 035a39982c8fa48e05a32b8739b80f47 557efc37387cd5fdbc8562674ca3c343 |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 MD5 MD5 |
repository.name.fl_str_mv |
Repositório Institucional da UFPE - Universidade Federal de Pernambuco (UFPE) |
repository.mail.fl_str_mv |
attena@ufpe.br |
_version_ |
1797780570066911232 |