Spectral normalization MFCC derived features for robust speech recognition

Bibliographic details
Main author: Lima, C. S.
Publication date: 2004
Other authors: Tavares, Adriano, Silva, Carlos A., Oliveira, Jorge F.
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
Full text: http://hdl.handle.net/1822/2047
Abstract: This paper presents a method for extracting MFCC parameters from a normalised power spectral density. The underlying spectral normalisation method rests on the observation that low-energy speech regions need the most robustness: noise dominates in these regions, so the speech there is more heavily corrupted. Low-energy regions usually contain unvoiced sounds, which include nearly half of the consonants, and they are inherently the least reliable because noise is effectively present even when speech is acquired under controlled conditions. This spectral normalisation was previously tested under additive artificial white noise in an isolated speech recogniser and showed very promising results [1]. For speech representation, MFCC parameters are widely regarded as more effective than features based on the power spectrum. This paper shows how the cepstral speech representation can take advantage of this spectral normalisation and reports results for continuous speech recognition under clean and artificially noisy conditions.
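As a rough illustration of where a spectral normalisation step would sit in an MFCC front end, the sketch below computes a power spectrum, normalises it, and then applies the usual mel filterbank, logarithm and DCT. The abstract does not define the normalisation rule, so `normalise_spectrum` is a hypothetical placeholder (unit total power per frame) and is not the authors' method. All function names, the filter count, the number of coefficients and the test signal are likewise assumptions made for the example.

```python
import cmath
import math


def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of one real frame via a naive DFT."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spec.append(abs(acc) ** 2 / n)
    return spec


def normalise_spectrum(spec, eps=1e-12):
    """ASSUMED placeholder: scale the frame to unit total power.

    The abstract does not describe the paper's normalisation; swap in the
    real rule here when reproducing the method.
    """
    total = sum(spec) + eps
    return [p / total for p in spec]


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    max_mel = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(max_mel * i / (n_filters + 1))
             for i in range(n_filters + 2)]
    bin_hz = [k * (sample_rate / 2.0) / (n_bins - 1) for k in range(n_bins)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc_from_spectrum(spec, sample_rate, n_filters=20, n_ceps=12, eps=1e-12):
    """Mel filterbank -> log -> DCT-II, keeping coefficients 1..n_ceps."""
    bank = mel_filterbank(n_filters, len(spec), sample_rate)
    log_e = [math.log(sum(w * p for w, p in zip(row, spec)) + eps)
             for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for c in range(1, n_ceps + 1)]


# Usage: one Hamming-windowed 440 Hz frame, then the same frame 10x louder.
sample_rate = 8000
n = 256
frame = [(0.54 - 0.46 * math.cos(2 * math.pi * t / (n - 1)))
         * math.sin(2 * math.pi * 440.0 * t / sample_rate) for t in range(n)]
loud = [10.0 * x for x in frame]

ceps = mfcc_from_spectrum(normalise_spectrum(power_spectrum(frame)), sample_rate)
ceps_loud = mfcc_from_spectrum(normalise_spectrum(power_spectrum(loud)), sample_rate)
print(len(ceps))
print(max(abs(a - b) for a, b in zip(ceps, ceps_loud)))
```

With this placeholder the cepstra of the original and the louder frame should agree up to rounding, because the gain cancels in the normalisation. The behaviour of the paper's own rule, which targets low-energy regions specifically, may differ.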
id RCAP_396de618d7ee642e50fe264db85b27be
oai_identifier_str oai:repositorium.sdum.uminho.pt:1822/2047
repository_id_str 7160
dc.contributor.none.fl_str_mv Universidade do Minho
dc.subject.por.fl_str_mv Robust speech recognition
Features mapping
publishDate 2004
dc.date.none.fl_str_mv 2004-09
2004-09-01T00:00:00Z
dc.type.driver.fl_str_mv conference paper
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/1822/2047
url http://hdl.handle.net/1822/2047
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv SPECOM'2004. INTERNATIONAL CONFERENCE SPEECH AND COMPUTER, 9, Saint Petersburg, 2004.
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
repository.mail.fl_str_mv mluisa.alvim@gmail.com
_version_ 1817544566956359680