Recommender Systems for Grocery Retail - A Machine Learning Approach

Detalhes bibliográficos
Autor(a) principal: Silva, Xavier dos Santos
Data de Publicação: 2020
Tipo de documento: Dissertação
Idioma: eng
Título da fonte: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Texto Completo: http://hdl.handle.net/10400.22/17716
Resumo: Recommender systems are present in our daily activities in different moments, such as when choosing a song to listen to or when doing online shopping. It is an everyday reality for people to have the help of computer systems in order to simplify regular decision activities. Grocery shopping is an essential part of people’s life and a frequent activity. Despite being a common habit, each customer has unique routines, needs and preferences regarding products and brands. This information is valuable for grocery retailers to know their customers better and to improve their marketing and operational activities. This dissertation aims to apply machine learning algorithms to the development of a recommender system capable of preparing personalized grocery shopping lists. The proposed architecture is designed to allow integration with different grocery retailers and support distinct TensorFlow algorithms. The process of extracting information from the dataset as features was explored, as well as the tuning of the model hyperparameters, to obtain better results. The recommendation engine is exposed via a distributed software architecture designed to allow retailers to integrate the recommender system with different existing solutions (e.g., websites or mobile applications). A case study to validate the implemented solution was performed, integrating it with a public dataset provided by Instacart. A comparison study between different machine learning algorithms over the adopted dataset has lead to the choice of the gradient boosted trees algorithm. The solution developed in the case study was compared against two non-machine learning approaches at predicting the last purchase of 360 arbitrary test customers. A pattern miningbased solution and a SQL-based heuristic were used. Different evaluation metrics (namely, the average accuracy, precision, recall, and f1-score) were registered. The way association rules with different strengths were reflected in the predictions of the developed solution was also analyzed. The gradient boosted trees-based implementation from the case study was capable of outperforming the compared solutions as far as evaluation metrics are concerned, and has shown a higher capability of predicting at least one correct item per customer. Also, it became evident that the strictest association rules were frequently found in the recommendations. The adopted solution and algorithm have shown promising results and a remarkable capability to provide meaningful predictions to the different customers, evidencing its capability to add value to grocery retail. Nevertheless, there is still potential for further expansion.
id RCAP_ec0d7ee5ad60de4d8f1f78044d3cc669
oai_identifier_str oai:recipp.ipp.pt:10400.22/17716
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
spelling Recommender Systems for Grocery Retail - A Machine Learning ApproachRecommender systemsGrocery retailMachine learningGradient boosted treesRecommender systems are present in our daily activities in different moments, such as when choosing a song to listen to or when doing online shopping. It is an everyday reality for people to have the help of computer systems in order to simplify regular decision activities. Grocery shopping is an essential part of people’s life and a frequent activity. Despite being a common habit, each customer has unique routines, needs and preferences regarding products and brands. This information is valuable for grocery retailers to know their customers better and to improve their marketing and operational activities. This dissertation aims to apply machine learning algorithms to the development of a recommender system capable of preparing personalized grocery shopping lists. The proposed architecture is designed to allow integration with different grocery retailers and support distinct TensorFlow algorithms. The process of extracting information from the dataset as features was explored, as well as the tuning of the model hyperparameters, to obtain better results. The recommendation engine is exposed via a distributed software architecture designed to allow retailers to integrate the recommender system with different existing solutions (e.g., websites or mobile applications). A case study to validate the implemented solution was performed, integrating it with a public dataset provided by Instacart. A comparison study between different machine learning algorithms over the adopted dataset has lead to the choice of the gradient boosted trees algorithm. The solution developed in the case study was compared against two non-machine learning approaches at predicting the last purchase of 360 arbitrary test customers. A pattern miningbased solution and a SQL-based heuristic were used. Different evaluation metrics (namely, the average accuracy, precision, recall, and f1-score) were registered. The way association rules with different strengths were reflected in the predictions of the developed solution was also analyzed. The gradient boosted trees-based implementation from the case study was capable of outperforming the compared solutions as far as evaluation metrics are concerned, and has shown a higher capability of predicting at least one correct item per customer. Also, it became evident that the strictest association rules were frequently found in the recommendations. The adopted solution and algorithm have shown promising results and a remarkable capability to provide meaningful predictions to the different customers, evidencing its capability to add value to grocery retail. Nevertheless, there is still potential for further expansion.Os sistemas de recomendação estão presentes no nosso quotidiano, em momentos como a escolha da música a ouvir ou a preparação de compras online. Estamos acostumados a contar com a ajuda de sistemas computacionais para simplificar tarefas habituais que envolvem decisões. Realizar compras de retalho alimentar é uma parte importante e frequente da nossa vida. Apesar de ser um hábito comum, cada um de nós tem as suas próprias rotinas, necessidades e preferências no que toca a produtos e marcas. Esta informação é valiosa para que os retalhistas alimentares consigam conhecer melhor os seus clientes e melhorar atividades operacionais e de marketing. Esta dissertação tem como objetivo a aplicação de algoritmos de machine learning na criação de um sistema de recomendação capaz de preparar listas de compras personalizadas. A arquitetura proposta é desenhada com o objetivo de permitir a integração com diferentes retalhistas e a utilização de diferentes algoritmos em TensorFlow. O processo de extração de informação na forma de features foi explorado, tal como a afinação dos hiperparâmetros do modelo, para obter melhores resultados. O motor de recomendações é exposto através de uma arquitetura de software distribuída, com o propósito de permitir que os retalhistas alimentares possam integrar este sistema com diferentes soluções existentes (e.g., websites ou aplicações móveis). Foi realizado um caso de estudo para validar a solução implementada, através da integração da solução com os dados públicos disponibilizados pelo retalhista Instacart. Uma comparação entre a aplicação de diferentes algoritmos de machine learning aos dados utilizados, levou à adoção do algoritmo gradient boosted trees. A solução desenvolvida no caso de estudo foi comparada com duas abordagens não baseadas em machine learning para a previsão da última compra de 360 clientes arbitrários. Foi usada uma abordagem baseada em pattern mining e uma abordagem baseada em SQL. Diferentes métricas de avaliação (nomeadamente accuracy, precision, recall e f1-score médios) foram registadas. Foi também analisada a forma como diferentes regras de associação se encontraram refletidas nas recomendações da solução desenvolvida. A implementação baseada em gradient boosted trees do caso de estudo superou as soluções com as quais foi comparada quanto às métricas de avaliação, e mostrou uma maior capacidade de recomendar pelo menos um produto correto por cliente. Verificou-se também que as regras de associação mais fortes estão frequentemente refletidas nas recomendações. A abordagem adotada e o algoritmo aprofundado mostraram resultados promissores e uma capacidade notável de fornecer recomendações úteis aos diferentes clientes, evidenciando a sua aptidão para adicionar valor ao retalho alimentar. Ainda assim, este sistema apresenta um elevado potencial para expansão.Ferreira, Carlos Manuel Abreu GomesRepositório Científico do Instituto Politécnico do PortoSilva, Xavier dos Santos2021-03-30T13:13:04Z20202020-01-01T00:00:00Zinfo:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/masterThesisapplication/pdfhttp://hdl.handle.net/10400.22/17716TID:202551318enginfo:eu-repo/semantics/openAccessreponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informaçãoinstacron:RCAAP2023-03-13T13:08:48Zoai:recipp.ipp.pt:10400.22/17716Portal AgregadorONGhttps://www.rcaap.pt/oai/openaireopendoar:71602024-03-19T17:37:19.658060Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informaçãofalse
dc.title.none.fl_str_mv Recommender Systems for Grocery Retail - A Machine Learning Approach
title Recommender Systems for Grocery Retail - A Machine Learning Approach
spellingShingle Recommender Systems for Grocery Retail - A Machine Learning Approach
Silva, Xavier dos Santos
Recommender systems
Grocery retail
Machine learning
Gradient boosted trees
title_short Recommender Systems for Grocery Retail - A Machine Learning Approach
title_full Recommender Systems for Grocery Retail - A Machine Learning Approach
title_fullStr Recommender Systems for Grocery Retail - A Machine Learning Approach
title_full_unstemmed Recommender Systems for Grocery Retail - A Machine Learning Approach
title_sort Recommender Systems for Grocery Retail - A Machine Learning Approach
author Silva, Xavier dos Santos
author_facet Silva, Xavier dos Santos
author_role author
dc.contributor.none.fl_str_mv Ferreira, Carlos Manuel Abreu Gomes
Repositório Científico do Instituto Politécnico do Porto
dc.contributor.author.fl_str_mv Silva, Xavier dos Santos
dc.subject.por.fl_str_mv Recommender systems
Grocery retail
Machine learning
Gradient boosted trees
topic Recommender systems
Grocery retail
Machine learning
Gradient boosted trees
description Recommender systems are present in our daily activities in different moments, such as when choosing a song to listen to or when doing online shopping. It is an everyday reality for people to have the help of computer systems in order to simplify regular decision activities. Grocery shopping is an essential part of people’s life and a frequent activity. Despite being a common habit, each customer has unique routines, needs and preferences regarding products and brands. This information is valuable for grocery retailers to know their customers better and to improve their marketing and operational activities. This dissertation aims to apply machine learning algorithms to the development of a recommender system capable of preparing personalized grocery shopping lists. The proposed architecture is designed to allow integration with different grocery retailers and support distinct TensorFlow algorithms. The process of extracting information from the dataset as features was explored, as well as the tuning of the model hyperparameters, to obtain better results. The recommendation engine is exposed via a distributed software architecture designed to allow retailers to integrate the recommender system with different existing solutions (e.g., websites or mobile applications). A case study to validate the implemented solution was performed, integrating it with a public dataset provided by Instacart. A comparison study between different machine learning algorithms over the adopted dataset has lead to the choice of the gradient boosted trees algorithm. The solution developed in the case study was compared against two non-machine learning approaches at predicting the last purchase of 360 arbitrary test customers. A pattern miningbased solution and a SQL-based heuristic were used. Different evaluation metrics (namely, the average accuracy, precision, recall, and f1-score) were registered. The way association rules with different strengths were reflected in the predictions of the developed solution was also analyzed. The gradient boosted trees-based implementation from the case study was capable of outperforming the compared solutions as far as evaluation metrics are concerned, and has shown a higher capability of predicting at least one correct item per customer. Also, it became evident that the strictest association rules were frequently found in the recommendations. The adopted solution and algorithm have shown promising results and a remarkable capability to provide meaningful predictions to the different customers, evidencing its capability to add value to grocery retail. Nevertheless, there is still potential for further expansion.
publishDate 2020
dc.date.none.fl_str_mv 2020
2020-01-01T00:00:00Z
2021-03-30T13:13:04Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10400.22/17716
TID:202551318
url http://hdl.handle.net/10400.22/17716
identifier_str_mv TID:202551318
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799131463491256320