Music Genre Classification Using Timbral Feature Fusion on i-vector Framework
Main author: | Rajan, Rajeev |
---|---|
Publication date: | 2021 |
Other authors: | Harishanker G., Athirasree C.A, Haritha S.M. |
Document type: | Article |
Language: | eng |
Source title: | INFOCOMP: Jornal de Ciência da Computação |
Full text: | https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/1604 |
Abstract: | A method for automatic music genre classification based on the fusion of high-level and low-level timbral descriptors is proposed. The high-level features, namely i-vectors, are computed from a mel-frequency cepstral coefficient (MFCC)-GMM framework. Low-level timbral descriptors, namely MFCC, modified group delay features (MODGDF), and a timbral feature set, are also computed from the audio files. Initially, the experiment is performed using i-vectors alone. Later, the low-level timbral features are appended to the high-level i-vector features to form a high-dimensional feature vector (55 dimensions). Support vector machine (SVM) and deep neural network (DNN) based classifiers are employed for the experiment. Performance is evaluated on 5 genres of the GTZAN dataset. With high-level i-vector features alone, the baseline SVM and DNN-based classifiers report average classification accuracies of 79.30% and 80.67%, respectively. A further improvement of 9% in performance is observed when the low-level timbral descriptors are fused with the i-vectors, in both the SVM and DNN frameworks. The results demonstrate the potential of timbral feature fusion in the music genre classification task. |
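
The feature-level fusion described in the abstract (appending low-level timbral descriptors to an i-vector) can be sketched as below. This is a minimal illustration only, not the authors' implementation: the abstract gives the fused size (55) but not how it divides between the i-vector and the low-level descriptors, so `IVECTOR_DIM` and `LOW_LEVEL_DIM` are hypothetical values chosen just to total 55. The per-feature standardization is likewise an assumption (a common step before an SVM or DNN), and the feature values are placeholders rather than real audio features.

```python
from statistics import mean, pstdev


def fuse(ivector, low_level):
    """Feature-level fusion: append low-level descriptors to the i-vector."""
    return list(ivector) + list(low_level)


def zscore_columns(rows):
    """Standardize each feature column to zero mean and unit variance.

    Assumption: the abstract does not mention normalization; it is added
    here so features on different scales contribute comparably.
    """
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    stds = [pstdev(c) or 1.0 for c in columns]  # constant column: divide by 1
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


# Hypothetical split of the 55 dimensions (not stated in the abstract).
IVECTOR_DIM = 20
LOW_LEVEL_DIM = 35

# Placeholder feature values for three clips; real values would come from
# an i-vector extractor and a timbral feature extractor.
clips = [
    fuse(
        [0.1 * (i + k) for i in range(IVECTOR_DIM)],
        [float(k * j) for j in range(LOW_LEVEL_DIM)],
    )
    for k in range(3)
]

fused = zscore_columns(clips)
print(len(fused), len(fused[0]))
```

Each row of `fused` is one clip's 55-dimensional vector, which would then be passed to whichever classifier is used.
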
id |
UFLA-5_e493bd31a9bbd82fbb46901b421ba3fe |
---|---|
oai_identifier_str |
oai:infocomp.dcc.ufla.br:article/1604 |
network_acronym_str |
UFLA-5 |
network_name_str |
INFOCOMP: Jornal de Ciência da Computação |
repository_id_str |
|
spelling |
Music Genre Classification Using Timbral Feature Fusion on i-vector Framework; A method for automatic music genre classification based on the fusion of high-level and low-level timbral descriptors is proposed. The high-level features, namely i-vectors, are computed from a mel-frequency cepstral coefficient (MFCC)-GMM framework. Low-level timbral descriptors, namely MFCC, modified group delay features (MODGDF), and a timbral feature set, are also computed from the audio files. Initially, the experiment is performed using i-vectors alone. Later, the low-level timbral features are appended to the high-level i-vector features to form a high-dimensional feature vector (55 dimensions). Support vector machine (SVM) and deep neural network (DNN) based classifiers are employed for the experiment. Performance is evaluated on 5 genres of the GTZAN dataset. With high-level i-vector features alone, the baseline SVM and DNN-based classifiers report average classification accuracies of 79.30% and 80.67%, respectively. A further improvement of 9% in performance is observed when the low-level timbral descriptors are fused with the i-vectors, in both the SVM and DNN frameworks. The results demonstrate the potential of timbral feature fusion in the music genre classification task.; Editora da UFLA; 2021-12-01; info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion; application/pdf; https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/1604; INFOCOMP Journal of Computer Science; Vol. 20 No. 2 (2021): December 2021; 1982-3363; 1807-4545; reponame:INFOCOMP: Jornal de Ciência da Computação; instname:Universidade Federal de Lavras (UFLA); instacron:UFLA; eng; https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/1604/569; Copyright (c) 2021 Rajeev Rajan, Harishanker G., Athirasree C.A, Haritha S.M.; info:eu-repo/semantics/openAccess; Rajan, Rajeev; Harishanker G.; Athirasree C.A; Haritha S.M.; 2021-12-01T17:16:52Z; oai:infocomp.dcc.ufla.br:article/1604; Revista; https://infocomp.dcc.ufla.br/index.php/infocomp; PUB; https://infocomp.dcc.ufla.br/index.php/infocomp/oai; infocomp@dcc.ufla.br||apfreire@dcc.ufla.br; 1982-3363; 1807-4545; opendoar:2024-05-21T19:54:47.118376; INFOCOMP: Jornal de Ciência da Computação - Universidade Federal de Lavras (UFLA); true |
dc.title.none.fl_str_mv |
Music Genre Classification Using Timbral Feature Fusion on i-vector Framework |
title |
Music Genre Classification Using Timbral Feature Fusion on i-vector Framework |
spellingShingle |
Music Genre Classification Using Timbral Feature Fusion on i-vector Framework Rajan, Rajeev |
title_short |
Music Genre Classification Using Timbral Feature Fusion on i-vector Framework |
title_full |
Music Genre Classification Using Timbral Feature Fusion on i-vector Framework |
title_fullStr |
Music Genre Classification Using Timbral Feature Fusion on i-vector Framework |
title_full_unstemmed |
Music Genre Classification Using Timbral Feature Fusion on i-vector Framework |
title_sort |
Music Genre Classification Using Timbral Feature Fusion on i-vector Framework |
author |
Rajan, Rajeev |
author_facet |
Rajan, Rajeev Harishanker G. Athirasree C.A Haritha S.M. |
author_role |
author |
author2 |
Harishanker G. Athirasree C.A Haritha S.M. |
author2_role |
author author author |
dc.contributor.author.fl_str_mv |
Rajan, Rajeev Harishanker G. Athirasree C.A Haritha S.M. |
description |
A method for automatic music genre classification based on the fusion of high-level and low-level timbral descriptors is proposed. The high-level features, namely i-vectors, are computed from a mel-frequency cepstral coefficient (MFCC)-GMM framework. Low-level timbral descriptors, namely MFCC, modified group delay features (MODGDF), and a timbral feature set, are also computed from the audio files. Initially, the experiment is performed using i-vectors alone. Later, the low-level timbral features are appended to the high-level i-vector features to form a high-dimensional feature vector (55 dimensions). Support vector machine (SVM) and deep neural network (DNN) based classifiers are employed for the experiment. Performance is evaluated on 5 genres of the GTZAN dataset. With high-level i-vector features alone, the baseline SVM and DNN-based classifiers report average classification accuracies of 79.30% and 80.67%, respectively. A further improvement of 9% in performance is observed when the low-level timbral descriptors are fused with the i-vectors, in both the SVM and DNN frameworks. The results demonstrate the potential of timbral feature fusion in the music genre classification task. |
publishDate |
2021 |
dc.date.none.fl_str_mv |
2021-12-01 |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/1604 |
url |
https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/1604 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/1604/569 |
dc.rights.driver.fl_str_mv |
Copyright (c) 2021 Rajeev Rajan, Harishanker G., Athirasree C.A, Haritha S.M. info:eu-repo/semantics/openAccess |
rights_invalid_str_mv |
Copyright (c) 2021 Rajeev Rajan, Harishanker G., Athirasree C.A, Haritha S.M. |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Editora da UFLA |
publisher.none.fl_str_mv |
Editora da UFLA |
dc.source.none.fl_str_mv |
INFOCOMP Journal of Computer Science; Vol. 20 No. 2 (2021): December 2021 1982-3363 1807-4545 reponame:INFOCOMP: Jornal de Ciência da Computação instname:Universidade Federal de Lavras (UFLA) instacron:UFLA |
instname_str |
Universidade Federal de Lavras (UFLA) |
instacron_str |
UFLA |
institution |
UFLA |
reponame_str |
INFOCOMP: Jornal de Ciência da Computação |
collection |
INFOCOMP: Jornal de Ciência da Computação |
repository.name.fl_str_mv |
INFOCOMP: Jornal de Ciência da Computação - Universidade Federal de Lavras (UFLA) |
repository.mail.fl_str_mv |
infocomp@dcc.ufla.br||apfreire@dcc.ufla.br |
_version_ |
1799874742669279232 |