Reward shaping to improve the performance of deep reinforcement learning in perishable inventory management

Moor, Bram J. de; Gijsbrechts, Joren; Boute, Robert N.

Bibliographic details
Main author:	Moor, Bram J. de
Publication date:	2022
Other authors:	Gijsbrechts, Joren, Boute, Robert N.
Document type:	Article
Language:	eng
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text:	http://hdl.handle.net/10400.14/43565
Abstract:	Deep reinforcement learning (DRL) has proven to be an effective, general-purpose technology to develop ‘good’ replenishment policies in inventory management. We show how transfer learning from existing, well-performing heuristics may stabilize the training process and improve the performance of DRL in inventory control. While the idea is general, we specifically implement potential-based reward shaping to a deep Q-network algorithm to manage inventory of perishable goods that, cursed by dimensionality, has proven to be notoriously complex. The application of our approach may not only improve inventory cost performance and reduce computational effort, the increased training stability may also help to gain trust in the policies obtained by black box DRL algorithms.
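
As a rough illustration of the potential-based reward shaping named in the abstract, the sketch below adds the term gamma * Phi(s') - Phi(s) to the environment reward before it would be passed to a DQN update. The state layout (units on hand by remaining shelf life), the `base_stock_potential` function, its `target` and `weight` parameters, and the example numbers are all hypothetical and are not taken from the paper; only the general shaping form is assumed, which is known to leave the optimal policy unchanged.

```python
from typing import Callable, Sequence

# Hypothetical state: units on hand, indexed by remaining shelf life.
State = Sequence[int]


def base_stock_potential(state: State, target: int = 10, weight: float = 1.0) -> float:
    """Illustrative potential Phi(s): penalize distance from a base-stock target.

    The target level stands in for a well-performing heuristic; it is an
    assumption made for this sketch, not a value from the paper.
    """
    return -weight * abs(target - sum(state))


def shaped_reward(
    reward: float,
    state: State,
    next_state: State,
    potential: Callable[[State], float],
    gamma: float = 0.99,
) -> float:
    """Return r + gamma * Phi(s') - Phi(s), the potential-based shaped reward."""
    return reward + gamma * potential(next_state) - potential(state)


# Example: the environment reports a cost of 5 (reward -5) while total
# inventory moves from 5 units toward the hypothetical target of 10.
r_shaped = shaped_reward(-5.0, [2, 3], [4, 4], base_stock_potential, gamma=0.5)
```

In a DQN training loop, `r_shaped` would replace the raw reward in the temporal-difference target, so the agent receives an extra signal whenever it moves toward states the heuristic favors.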

Item metadata

id	RCAP_5c04762c3d33c9a4fb1510e7c132a829
oai_identifier_str	oai:repositorio.ucp.pt:10400.14/43565
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str	7160
dc.title.none.fl_str_mv	Reward shaping to improve the performance of deep reinforcement learning in perishable inventory management
author	Moor, Bram J. de
author_facet	Moor, Bram J. de Gijsbrechts, Joren Boute, Robert N.
author_role	author
author2	Gijsbrechts, Joren Boute, Robert N.
author2_role	author author
dc.contributor.none.fl_str_mv	Veritati - Repositório Institucional da Universidade Católica Portuguesa
dc.contributor.author.fl_str_mv	Moor, Bram J. de Gijsbrechts, Joren Boute, Robert N.
dc.subject.por.fl_str_mv	Deep reinforcement learning Inventory Perishable inventory management Reward shaping Transfer learning
topic	Deep reinforcement learning Inventory Perishable inventory management Reward shaping Transfer learning
publishDate	2022
dc.date.none.fl_str_mv	2022-09-01 2022-09-01T00:00:00Z 2024-09-01T00:00:00Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/article
format	article
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10400.14/43565
url	http://hdl.handle.net/10400.14/43565
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	0377-2217 10.1016/j.ejor.2021.10.045 85119188665 000793723100010
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/embargoedAccess
eu_rights_str_mv	embargoedAccess
dc.format.none.fl_str_mv	application/pdf
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_	1799136942300856320
