Spectral normalization MFCC derived features for robust speech recognition
Main author: | Lima, C. S. |
---|---|
Publication date: | 2004 |
Other authors: | Tavares, Adriano; Silva, Carlos A.; Oliveira, Jorge F. |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/1822/2047 |
Abstract: | This paper presents a method for extracting MFCC parameters from a normalised power spectrum density. The underlying spectral normalisation method is based on the fact that speech regions with less energy need more robustness, since noise is more dominant in these regions and the speech is therefore more corrupted. Low-energy speech regions usually contain unvoiced sounds, which include nearly half of the consonants, and are by nature the least reliable ones because noise is effectively present even when the speech is acquired under controlled conditions. This spectral normalisation was tested under additive artificial white noise in an isolated speech recogniser and showed very promising results [1]. It is well known that, for speech representation, MFCC parameters appear to be more effective than power-spectrum-based features. This paper shows how the cepstral speech representation can take advantage of the above-mentioned spectral normalisation and presents some results in the continuous speech recognition paradigm under clean and artificial noise conditions. |
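
The processing chain named in the abstract (power spectrum, spectral normalisation, then MFCC extraction) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the record does not specify the normalisation itself, so it appears only as a hypothetical `normalise` hook, and the function names, the 24 mel filters, the 13 coefficients, and the 16 kHz sample rate are all illustrative choices.

```python
import numpy as np


def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular mel-spaced filters, shape (n_filters, n_fft // 2 + 1)."""
    n_bins = n_fft // 2 + 1
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                            n_filters + 2)
    hz_edges = mel_to_hz(mel_edges)
    bin_freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    fbank = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        lo, mid, hi = hz_edges[i], hz_edges[i + 1], hz_edges[i + 2]
        rising = (bin_freqs - lo) / (mid - lo)
        falling = (hi - bin_freqs) / (hi - mid)
        fbank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fbank


def dct_ii(x, n_coeffs):
    """Unnormalised DCT-II along the last axis, first n_coeffs terms."""
    n = x.shape[-1]
    k = np.arange(n_coeffs)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    return x @ basis.T


def mfcc_from_power_spectrum(power_spec, sample_rate,
                             n_filters=24, n_coeffs=13, normalise=None):
    """MFCCs from a one-sided power spectrum of shape (n_frames, n_bins).

    `normalise` is a hypothetical hook standing in for the paper's
    spectral normalisation, which this record does not describe.
    """
    if normalise is not None:
        power_spec = normalise(power_spec)
    n_fft = 2 * (power_spec.shape[-1] - 1)
    fbank = mel_filterbank(n_filters, n_fft, sample_rate)
    mel_energies = power_spec @ fbank.T
    log_mel = np.log(np.maximum(mel_energies, 1e-10))  # floor avoids log(0)
    return dct_ii(log_mel, n_coeffs)


# Usage: five 512-sample frames of synthetic white noise at 16 kHz.
rng = np.random.default_rng(0)
frames = rng.standard_normal((5, 512))
power_spec = np.abs(np.fft.rfft(frames, axis=-1)) ** 2
features = mfcc_from_power_spectrum(power_spec, sample_rate=16000)
```

Placing the hook before the mel filterbank mirrors the abstract's statement that the MFCC parameters are extracted from the normalised spectrum. Note that a gain that is constant across frequency within a frame only shifts the zeroth coefficient, so a normalisation must reshape the spectrum within a frame to affect the higher-order coefficients.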
id |
RCAP_396de618d7ee642e50fe264db85b27be |
---|---|
oai_identifier_str |
oai:repositorium.sdum.uminho.pt:1822/2047 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
author |
Lima, C. S. |
author_facet |
Lima, C. S.; Tavares, Adriano; Silva, Carlos A.; Oliveira, Jorge F. |
author_role |
author |
author2 |
Tavares, Adriano; Silva, Carlos A.; Oliveira, Jorge F. |
author2_role |
author author author |
dc.contributor.none.fl_str_mv |
Universidade do Minho |
dc.contributor.author.fl_str_mv |
Lima, C. S.; Tavares, Adriano; Silva, Carlos A.; Oliveira, Jorge F. |
dc.subject.por.fl_str_mv |
Robust speech recognition; Features mapping |
topic |
Robust speech recognition; Features mapping |
publishDate |
2004 |
dc.date.none.fl_str_mv |
2004-09 2004-09-01T00:00:00Z |
dc.type.driver.fl_str_mv |
conference paper |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/1822/2047 |
url |
http://hdl.handle.net/1822/2047 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
SPECOM'2004. INTERNATIONAL CONFERENCE SPEECH AND COMPUTER, 9, Saint Petersburg, 2004. |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
mluisa.alvim@gmail.com |
_version_ |
1817544566956359680 |