Music Genre Classification Using Timbral Feature Fusion on i-vector Framework

Bibliographic details
Main author: Rajan, Rajeev
Publication date: 2021
Other authors: Harishanker G., Athirasree C.A, Haritha S.M.
Document type: Article
Language: eng
Source title: INFOCOMP: Jornal de Ciência da Computação
Full text: https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/1604
Abstract: A method for automatic music genre classification based on the fusion of high-level and low-level timbral descriptors is proposed. The high-level features, namely i-vectors, are computed from a mel-frequency cepstral coefficient (MFCC)-GMM framework. Low-level timbral descriptors, namely MFCC, modified group delay features (MODGDF), and a timbral feature set, are also computed from the audio files. Initially, the experiment is performed using i-vectors alone. Later, the low-level timbral features are appended to the high-level i-vector features to form a high-dimensional feature vector (55 dimensions). Support vector machine (SVM) and deep neural network (DNN) based classifiers are employed for the experiment. Performance is evaluated on 5 genres of the GTZAN dataset. With high-level i-vector features alone, the baseline SVM and DNN-based classifiers report average classification accuracies of 79.30% and 80.67%, respectively. A further improvement (9%) in performance is observed when the low-level timbral descriptors are fused with the i-vectors in both the SVM and DNN frameworks. The results demonstrate the potential of timbral feature fusion in the music genre classification task.
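The fusion step described in the abstract (appending low-level timbral descriptors to an i-vector to obtain a 55-dimensional vector) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract states only the fused dimensionality, so the 20 + 35 split, the random placeholder features, the function names `fuse` and `zscore_columns`, and the per-dimension standardization step are all assumptions made for illustration.

```python
import random
from statistics import mean, pstdev

# Hypothetical split: the abstract gives only the fused size (55),
# not how many dimensions come from each feature type.
IVECTOR_DIM = 20   # assumption
TIMBRAL_DIM = 35   # assumption


def fuse(ivector, timbral):
    """Early fusion: append the low-level descriptors to the i-vector."""
    return list(ivector) + list(timbral)


def zscore_columns(rows):
    """Standardize each feature dimension across clips.

    Not mentioned in the abstract; a common step before an SVM or DNN.
    Constant columns are left at zero instead of dividing by zero.
    """
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]


# Placeholder features for four clips; real values would come from an
# i-vector extractor and a timbral feature extractor.
rng = random.Random(0)
ivectors = [[rng.gauss(0, 1) for _ in range(IVECTOR_DIM)] for _ in range(4)]
timbral = [[rng.gauss(0, 1) for _ in range(TIMBRAL_DIM)] for _ in range(4)]

fused = zscore_columns([fuse(i, t) for i, t in zip(ivectors, timbral)])
print(len(fused), len(fused[0]))  # prints "4 55"
```

Each row of `fused` would then be passed to a classifier such as an SVM or a DNN; the classifier itself is omitted here because the abstract gives no configuration details.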
id UFLA-5_e493bd31a9bbd82fbb46901b421ba3fe
oai_identifier_str oai:infocomp.dcc.ufla.br:article/1604
network_acronym_str UFLA-5
network_name_str INFOCOMP: Jornal de Ciência da Computação
repository_id_str
dc.title.none.fl_str_mv Music Genre Classification Using Timbral Feature Fusion on i-vector Framework
author Rajan, Rajeev
author_facet Rajan, Rajeev
Harishanker G.
Athirasree C.A
Haritha S.M.
author_role author
author2 Harishanker G.
Athirasree C.A
Haritha S.M.
author2_role author
author
author
dc.contributor.author.fl_str_mv Rajan, Rajeev
Harishanker G.
Athirasree C.A
Haritha S.M.
publishDate 2021
dc.date.none.fl_str_mv 2021-12-01
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/1604
url https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/1604
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/1604/569
dc.rights.driver.fl_str_mv Copyright (c) 2021 Rajeev Rajan, Harishanker G., Athirasree C.A, Haritha S.M.
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Copyright (c) 2021 Rajeev Rajan, Harishanker G., Athirasree C.A, Haritha S.M.
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Editora da UFLA
publisher.none.fl_str_mv Editora da UFLA
dc.source.none.fl_str_mv INFOCOMP Journal of Computer Science; Vol. 20 No. 2 (2021): December 2021
1982-3363
1807-4545
reponame:INFOCOMP: Jornal de Ciência da Computação
instname:Universidade Federal de Lavras (UFLA)
instacron:UFLA
instname_str Universidade Federal de Lavras (UFLA)
instacron_str UFLA
institution UFLA
reponame_str INFOCOMP: Jornal de Ciência da Computação
collection INFOCOMP: Jornal de Ciência da Computação
repository.name.fl_str_mv INFOCOMP: Jornal de Ciência da Computação - Universidade Federal de Lavras (UFLA)
repository.mail.fl_str_mv infocomp@dcc.ufla.br||apfreire@dcc.ufla.br
_version_ 1799874742669279232