Improving the role of unvoiced speech segments by spectral normalisation in robust speech recognition

Lima, C. S.; Almeida, Luís B.; Monteiro, João L.

Bibliographic details
Main author:	Lima, C. S.
Publication date:	2002
Other authors:	Almeida, Luís B., Monteiro, João L.
Language:	eng
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text:	http://hdl.handle.net/1822/2147
Abstract:	This paper presents a method, based on spectral normalisation, for extracting robust speech features in additive noise. The method has two main goals: 1) The “peaked” spectral zones, where most of the speech energy is concentrated, must be preserved (from clean to noisy speech features) as much as possible by the feature extraction process. These spectral regions are usually the most reliable because of their higher speech energy and the frequent assumption of independence between speech and noise. 2) The speech regions with less energy need more robustness, since the noise is more dominant in these regions and the speech is therefore more corrupted. These regions usually correspond to unvoiced speech, which includes nearly half of the consonants. The proposed normalisation is optimal if both the corrupted process and the noise process have white noise characteristics. Optimal normalisation means that the corrupting noise does not change the means of the observed vectors of the corrupted process at all. For signal-to-noise ratios greater than 5 dB, the results show that, for stationary white noise, the proposed normalisation process, in which the noise characteristics are ignored, outperforms conventional Markov model composition, in which the noise must be known. Additionally, if the noise is known, a reasonable approximation of the inverted system can easily be obtained by performing noise compensation, further increasing the recogniser performance.
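The optimality condition stated in the abstract can be illustrated with a small numerical sketch. The abstract does not give the normalisation formula, so the sum-to-one scaling below (the hypothetical `normalise_spectrum`), the eight-bin spectra, and all power values are assumptions chosen only for illustration; this is not the authors' method. The sketch works with expected power spectra and assumes independent speech and noise, so that the powers add.

```python
import numpy as np


def normalise_spectrum(power):
    """Scale a power spectrum so that its bins sum to one.

    Hypothetical stand-in for the paper's normalisation, which the
    abstract does not specify.
    """
    power = np.asarray(power, dtype=float)
    return power / power.sum()


# Expected power per frequency bin (arbitrary illustrative values).
flat_speech = np.full(8, 2.0)  # noise-like, spectrally flat segment
peaked_speech = np.array([0.5, 8.0, 1.0, 0.5, 0.5, 4.0, 0.5, 1.0])
white_noise = np.full(8, 1.5)  # stationary white noise, same power in every bin

# Assuming speech and noise are independent, their expected power spectra add.
flat_noisy = flat_speech + white_noise
peaked_noisy = peaked_speech + white_noise

# Largest change that the noise causes in any normalised bin.
flat_shift = np.abs(
    normalise_spectrum(flat_noisy) - normalise_spectrum(flat_speech)
).max()
peaked_shift = np.abs(
    normalise_spectrum(peaked_noisy) - normalise_spectrum(peaked_speech)
).max()

# The strongest bin of the peaked spectrum stays in the same place.
peak_kept = int(np.argmax(peaked_noisy)) == int(np.argmax(peaked_speech))

print(f"flat segment shift:   {flat_shift:.3f}")
print(f"peaked segment shift: {peaked_shift:.3f}")
print(f"peak position kept:   {peak_kept}")
```

Under these assumptions the flat segment is unchanged by the added white noise after normalisation, which matches the case the abstract calls optimal. The peaked segment does change in value, although the position of its strongest bin is retained.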

Item metadata

id	RCAP_54c90d57254b323bbdc1624fd59ba96e
oai_identifier_str	oai:repositorium.sdum.uminho.pt:1822/2147
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str	7160
dc.title.none.fl_str_mv	Improving the role of unvoiced speech segments by spectral normalisation in robust speech recognition
author	Lima, C. S.
author_facet	Lima, C. S. Almeida, Luís B. Monteiro, João L.
author_role	author
author2	Almeida, Luís B. Monteiro, João L.
author2_role	author author
dc.contributor.none.fl_str_mv	Universidade do Minho
dc.contributor.author.fl_str_mv	Lima, C. S. Almeida, Luís B. Monteiro, João L.
dc.subject.por.fl_str_mv	Feature robustness Robust speech recognition
topic	Feature robustness Robust speech recognition
publishDate	2002
dc.date.none.fl_str_mv	2002-09 2002-09-01T00:00:00Z
dc.type.driver.fl_str_mv	conference paper
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/1822/2147
url	http://hdl.handle.net/1822/2147
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING (ICSLP), 7, Denver, 2002.
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.publisher.none.fl_str_mv	International Speech Communication Association
publisher.none.fl_str_mv	International Speech Communication Association
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv	mluisa.alvim@gmail.com
_version_	1817544816177709056
