Audio-based cold-start in music recommendation systems

Bibliographic details
Main author: Borges, Rodrigo Carvalho
Publication date: 2022
Document type: Thesis
Language: eng
Source title: Biblioteca Digital de Teses e Dissertações da USP
Full text: https://www.teses.usp.br/teses/disponiveis/45/45134/tde-14102022-124655/
Abstract: Music streaming platforms have become popular in recent decades due to the increasing number of tracks available online. The track catalogues offered by these platforms are usually too large to be searched manually, and automatic recommendation algorithms can help users navigate them. More specifically, Music Recommendation Systems (MRS) are designed to analyze user listening behaviour and to predict the songs that will be played in the near future by a specific user or within a listening session. When new tracks are added to a platform, however, no listening data is available for them (the cold-start problem), and the system needs some other way to incorporate these tracks into its recommendation algorithms. In this work, we propose methods that leverage the audio of recently added tracks to compensate for the lack of interaction data. Our proposals cover collaborative filtering (CF), sequence-aware (SA), and stream-based (SB) recommendation systems, with audio files represented as codeword histograms, Mel-spectrograms, and raw waveforms. In the first experiment, we propose a method that applies Convolutional Neural Networks (CNN) to map audio content to profiles containing the users who listened to a track. In the second experiment, Recurrent Neural Networks (RNN) are trained to reproduce the audio features of the upcoming tracks within a listening session, given the audio feature of the current track. An inverted index structure is used to retrieve tracks efficiently from their estimated audio features. In the third experiment, we propose a model that maps track-to-track transitions to an audio domain in the manner of a multi-level Markov chain. The method supports dynamic updates, which makes it applicable to data-stream scenarios.
The experiments were conducted using the LFM-1b music consumption dataset and audio previews downloaded from Spotify. Our methods achieved competitive prediction results in cold-start situations for CF and SA recommendation systems. The novel stream-based method, although based exclusively on audio content, recommends tracks with an accuracy comparable to that measured for conventional rating-based methods.
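The idea of mapping track-to-track transitions to an audio domain and updating the model dynamically can be illustrated with a minimal sketch. The code below is not the thesis's model: it is a plain first-order Markov chain, not the multi-level one, and it assumes each track has already been assigned a single integer audio codeword. The class name, method names, and track ids are all hypothetical.

```python
from collections import Counter, defaultdict


class AudioTransitionModel:
    """Hypothetical first-order Markov chain over audio codewords.

    Transitions are counted between codewords rather than between tracks,
    so a track with no listening data can still be recommended as long as
    its audio maps to a codeword that has been observed.
    """

    def __init__(self):
        # counts[a][b] = observed transitions from codeword a to codeword b
        self.counts = defaultdict(Counter)
        # tracks_by_codeword[c] = ids of tracks whose audio maps to codeword c
        self.tracks_by_codeword = defaultdict(set)

    def add_track(self, track_id, codeword):
        """Register a track, possibly brand new, under its audio codeword."""
        self.tracks_by_codeword[codeword].add(track_id)

    def update(self, prev_codeword, next_codeword):
        """Record one transition from the listening stream (incremental)."""
        self.counts[prev_codeword][next_codeword] += 1

    def recommend(self, current_codeword, k=3):
        """Return up to k track ids drawn from the most likely next codewords."""
        recommended = []
        for codeword, _ in self.counts[current_codeword].most_common():
            for track_id in sorted(self.tracks_by_codeword[codeword]):
                recommended.append(track_id)
                if len(recommended) == k:
                    return recommended
        return recommended
```

In this sketch, a newly added track registered under codeword 1 becomes recommendable as soon as transitions into codeword 1 have been observed for other tracks, and each call to update adjusts the model without retraining.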
id USP_e2a769b1b1d7cba77e358dd42e446c1a
oai_identifier_str oai:teses.usp.br:tde-14102022-124655
network_acronym_str USP
network_name_str Biblioteca Digital de Teses e Dissertações da USP
repository_id_str 2721
dc.title.none.fl_str_mv Audio-based cold-start in music recommendation systems
Sistemas de recomendação de música baseados em áudio
title Audio-based cold-start in music recommendation systems
author Borges, Rodrigo Carvalho
author_facet Borges, Rodrigo Carvalho
author_role author
dc.contributor.none.fl_str_mv Queiroz, Marcelo Gomes de
dc.contributor.author.fl_str_mv Borges, Rodrigo Carvalho
dc.subject.por.fl_str_mv Audio content
Audio-based music recommendation
Cold-start
Conteúdo de áudio
Music recommendation systems
Sistemas de recomendação de música
Sistemas de recomendação de música baseados em áudio
publishDate 2022
dc.date.none.fl_str_mv 2022-07-20
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://www.teses.usp.br/teses/disponiveis/45/45134/tde-14102022-124655/
url https://www.teses.usp.br/teses/disponiveis/45/45134/tde-14102022-124655/
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv
dc.rights.driver.fl_str_mv Release the content for public access.
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Release the content for public access.
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.coverage.none.fl_str_mv
dc.publisher.none.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP
publisher.none.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP
dc.source.none.fl_str_mv
reponame:Biblioteca Digital de Teses e Dissertações da USP
instname:Universidade de São Paulo (USP)
instacron:USP
instname_str Universidade de São Paulo (USP)
instacron_str USP
institution USP
reponame_str Biblioteca Digital de Teses e Dissertações da USP
collection Biblioteca Digital de Teses e Dissertações da USP
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP)
repository.mail.fl_str_mv virginia@if.usp.br|| atendimento@aguia.usp.br||virginia@if.usp.br
_version_ 1815256975731064832