Broad phonetic class definition driven by phone confusions
Main author: | Lopes, Carla |
---|---|
Publication date: | 2012 |
Other authors: | Perdigão, Fernando |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10316/102726 https://doi.org/10.1186/1687-6180-2012-158 |
Abstract: | Intermediate representations between the speech signal and phones may be used to improve discrimination among phones that are often confused. These representations are usually found according to broad phonetic classes, which are defined by a phonetician. This article proposes an alternative data-driven method to generate these classes. Phone confusion information from the analysis of the output of a phone recognition system is used to find clusters at high risk of mutual confusion. A metric is defined to compute the distance between phones. The results, using TIMIT data, show that the proposed confusion-driven phone clustering method is an attractive alternative to the approaches based on human knowledge. A hierarchical classification structure to improve phone recognition is also proposed using a discriminative weight training method. Experiments show improvements in phone recognition on the TIMIT database compared to a baseline system. |
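The abstract describes clustering phones by how often a recognizer confuses them. This record does not give the article's distance metric, so the following is only a minimal sketch under stated assumptions: the distance is taken as one minus the average of the two row-normalized confusion rates, the clustering is plain average-linkage agglomeration, and the function names `confusion_distance` and `cluster_phones` as well as the toy counts are hypothetical, invented for illustration.

```python
def confusion_distance(counts):
    """Turn a phone confusion count matrix into a symmetric distance matrix.

    counts[i][j] = times reference phone i was recognized as phone j.
    Assumption (not the article's metric): distance is 1 minus the
    average of the two row-normalized confusion rates.
    """
    n = len(counts)
    # Row-normalize so each row estimates P(recognized j | reference i).
    probs = []
    for row in counts:
        total = sum(row)
        probs.append([c / total if total else 0.0 for c in row])
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                dist[i][j] = 1.0 - 0.5 * (probs[i][j] + probs[j][i])
    return dist


def cluster_phones(phones, dist, n_classes):
    """Average-linkage agglomerative clustering down to n_classes clusters."""
    clusters = [[i] for i in range(len(phones))]

    def linkage(a, b):
        # Mean pairwise distance between the members of two clusters.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_classes:
        # Merge the pair of clusters with the smallest average distance,
        # i.e. the phones most at risk of mutual confusion.
        _, x, y = min(
            (linkage(clusters[x], clusters[y]), x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(phones[i] for i in c) for c in clusters]


# Toy counts, invented for illustration (not TIMIT results).
phones = ["m", "n", "s", "z"]
counts = [
    [80, 18, 1, 1],
    [20, 78, 1, 1],
    [1, 1, 85, 13],
    [1, 1, 15, 83],
]
dist = confusion_distance(counts)
classes = cluster_phones(phones, dist, n_classes=2)
print(classes)
```

With these toy counts the nasals and the fricatives end up in separate classes, because each pair is confused far more often within itself than across pairs. A real system would build `counts` by aligning recognizer output against reference transcriptions.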
id |
RCAP_c791e9b528e10209760e2ca1df85a9bb |
---|---|
oai_identifier_str |
oai:estudogeral.uc.pt:10316/102726 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
dc.title.none.fl_str_mv |
Broad phonetic class definition driven by phone confusions |
author |
Lopes, Carla |
author_facet |
Lopes, Carla; Perdigão, Fernando |
author_role |
author |
author2 |
Perdigão, Fernando |
author2_role |
author |
dc.contributor.author.fl_str_mv |
Lopes, Carla; Perdigão, Fernando |
dc.subject.por.fl_str_mv |
Confusion Matrix; Conditional Random Field; Frame Error Rate; Discriminative Training; Context Window |
publishDate |
2012 |
dc.date.none.fl_str_mv |
2012 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10316/102726 https://doi.org/10.1186/1687-6180-2012-158 |
url |
http://hdl.handle.net/10316/102726 https://doi.org/10.1186/1687-6180-2012-158 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
1687-6180 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799134090696327168 |