Human-help in automated planning under uncertainty

Detalhes bibliográficos
Autor(a) principal: Franch, Ignasi Andrés
Data de Publicação: 2018
Tipo de documento: Tese
Idioma: eng
Título da fonte: Biblioteca Digital de Teses e Dissertações da USP
Texto Completo: http://www.teses.usp.br/teses/disponiveis/45/45134/tde-19122018-211701/
Resumo: Planning is the sub-area of artificial intelligence that studies the process of selecting actions to lead an agent, e.g. a robot or a softbot, to a goal state. In many realistic scenarios, any choice of actions can lead the robot into a dead-end state, that is, a state from which the goal cannot be reached. In such cases, the robot can, pro-actively, resort to human help in order to reach the goal, an approach called symbiotic autonomy. In this work, we propose two different approaches to tackle this problem: (I) contingent planning, where the initial state is partially observable, configuring a belief state, and the outcomes of the robot actions are non-deterministic; and (II) probabilistic planning, where the initial state may be partially or totally observable and the actions have probabilistic outcomes. In both approaches, the human help is considered a scarce resource that should be used only when necessary. In contingent planning, the problem is to find a policy (a function mapping belief states into actions) that: (i) guarantees the agent will always reach the goal (strong policy); (ii) guarantees that the agent will eventually reach the goal (strong cyclic policy), or (iii) does not guarantee achieving the goal (weak policy). In this scenario, we propose a contingent planning system that considers human help to transform weak policies into strong (cyclic) policies. To do so, two types of human help are included: (i) human actions that modify states and/or belief states; and (ii) human observations that modify belief states. In probabilistic planning, the problem is to find a policy (a function mapping between world states and actions) that can be one of these two types: a proper policy, where the agent has probability 1 of reaching the goal; or an improper policy, in the case of unavoidable dead-ends. In general, the goal of the agent is to find a policy that minimizes the expected accumulated cost of the actions while maximizes the probability of reaching the goal. In this scenario, this work proposes probabilistic planners that consider human help to transform improper policies into proper policies however, considering two new (alternative) criteria: either to minimize the probability of using human actions or to minimize the expected number of human actions. Furthermore, we show that optimal policies under these criteria can be efficiently computed either by increasing human action costs or given a penalty when a human help is used. Solutions proposed in both scenarios, contingent planning and probabilistic planning with human help, were evaluated over a collection of planning problems with dead-ends. The results show that: (i) all generated policies (strong (cyclic) or proper) include human help only when necessary; and (ii) we were able to find policies for contingent planning problems with up to 10^15000 belief states and for probabilistic planning problems with more than 3*10^18 physical states.
id USP_c43eb7d040a507e91a5f7589820da013
oai_identifier_str oai:teses.usp.br:tde-19122018-211701
network_acronym_str USP
network_name_str Biblioteca Digital de Teses e Dissertações da USP
repository_id_str 2721
spelling Human-help in automated planning under uncertaintyAjuda humana em planejamento automatizado sob incertezaArtificial intelligenceAutomated planningColaboração humano robotHuman-robot collaborationInteligência artificialPlanejamento automatizadoPlanejamento probabilísticoPlanejamento sob incertezaPlanning under uncertaintyProbabilistic planningPlanning is the sub-area of artificial intelligence that studies the process of selecting actions to lead an agent, e.g. a robot or a softbot, to a goal state. In many realistic scenarios, any choice of actions can lead the robot into a dead-end state, that is, a state from which the goal cannot be reached. In such cases, the robot can, pro-actively, resort to human help in order to reach the goal, an approach called symbiotic autonomy. In this work, we propose two different approaches to tackle this problem: (I) contingent planning, where the initial state is partially observable, configuring a belief state, and the outcomes of the robot actions are non-deterministic; and (II) probabilistic planning, where the initial state may be partially or totally observable and the actions have probabilistic outcomes. In both approaches, the human help is considered a scarce resource that should be used only when necessary. In contingent planning, the problem is to find a policy (a function mapping belief states into actions) that: (i) guarantees the agent will always reach the goal (strong policy); (ii) guarantees that the agent will eventually reach the goal (strong cyclic policy), or (iii) does not guarantee achieving the goal (weak policy). In this scenario, we propose a contingent planning system that considers human help to transform weak policies into strong (cyclic) policies. To do so, two types of human help are included: (i) human actions that modify states and/or belief states; and (ii) human observations that modify belief states. In probabilistic planning, the problem is to find a policy (a function mapping between world states and actions) that can be one of these two types: a proper policy, where the agent has probability 1 of reaching the goal; or an improper policy, in the case of unavoidable dead-ends. In general, the goal of the agent is to find a policy that minimizes the expected accumulated cost of the actions while maximizes the probability of reaching the goal. In this scenario, this work proposes probabilistic planners that consider human help to transform improper policies into proper policies however, considering two new (alternative) criteria: either to minimize the probability of using human actions or to minimize the expected number of human actions. Furthermore, we show that optimal policies under these criteria can be efficiently computed either by increasing human action costs or given a penalty when a human help is used. Solutions proposed in both scenarios, contingent planning and probabilistic planning with human help, were evaluated over a collection of planning problems with dead-ends. The results show that: (i) all generated policies (strong (cyclic) or proper) include human help only when necessary; and (ii) we were able to find policies for contingent planning problems with up to 10^15000 belief states and for probabilistic planning problems with more than 3*10^18 physical states.Planejamento é a subárea de Inteligência Artificial que estuda o processo de selecionar ações que levam um agente, por exemplo um robô, de um estado inicial a um estado meta. Em muitos cenários realistas, qualquer escolha de ações pode levar o robô para um estado que é um beco-sem-saída, isto é, um estado a partir do qual a meta não pode ser alcançada. Nestes casos, o robô pode, pró-ativamente, pedir ajuda humana para alcançar a meta, uma abordagem chamada autonomia simbiótica. Neste trabalho, propomos duas abordagens diferentes para tratar este problema: (I) planejamento contingente, em que o estado inicial é parcialmente observável, configurando um estado de crença, e existe não-determinismo nos resultados das ações; e (II) planejamento probabilístico, em que o estado inicial é totalmente observável e as ações tem efeitos probabilísticos. Em ambas abordagens a ajuda humana é considerada um recurso escasso e deve ser usada somente quando estritamente necessária. No planejamento contingente, o problema é encontrar uma política (mapeamento entre estados de crença e ações) com: (i) garantia de alcançar a meta (política forte); (ii) garantia de eventualmente alcançar a meta (política forte-cíclica), ou (iii) sem garantia de alcançar a meta (política fraca). Neste cenário, uma das contribuições deste trabalho é propor sistemas de planejamento contingente que considerem ajuda humana para transformar políticas fracas em políticas fortes (cíclicas). Para isso, incluímos ajuda humana de dois tipos: (i) ações que modificam estados do mundo e/ou estados de crença; e (ii) observações que modificam estados de crenças. Em planejamento probabilístico, o problema é encontrar uma política (mapeamento entre estados do mundo e ações) que pode ser de dois tipos: política própria, na qual o agente tem probabilidade 1 de alcançar a meta; ou política imprópria, caso exista um beco-sem-saída inevitável. O objetivo do agente é, em geral, encontrar uma política que minimize o custo esperado acumulado das ações enquanto maximize a probabilidade de alcançar a meta. Neste cenário, este trabalho propõe sistemas de planejamento probabilístico que considerem ajuda humana para transformar políticas impróprias em políticas próprias, porém considerando dois novos critérios: minimizar a probabilidade de usar ações do humano e minimizar o número esperado de ações do humano. Mostramos ainda que políticas ótimas sob esses novos critérios podem ser computadas de maneira eficiente considerando que ações humanas possuem um custo alto ou penalizando o agente ao pedir ajuda humana. Soluções propostas em ambos cenários, planejamento contingente e planejamento probabilístico com ajuda humana, foram empiricamente avaliadas sobre um conjunto de problemas de planejamento com becos-sem-saida. Os resultados mostram que: (i) todas as políticas geradas (fortes (cíclicas) ou próprias) incluem ajuda humana somente quando necessária; e (ii) foram encontradas políticas para problemas de planejamento contingente com até 10^15000 estados de crença e para problemas de planejamento probabilístico com até 3*10^18 estados do mundo.Biblioteca Digitais de Teses e Dissertações da USPBarros, Leliane Nunes deFranch, Ignasi Andrés2018-09-21info:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/doctoralThesisapplication/pdfhttp://www.teses.usp.br/teses/disponiveis/45/45134/tde-19122018-211701/reponame:Biblioteca Digital de Teses e Dissertações da USPinstname:Universidade de São Paulo (USP)instacron:USPLiberar o conteúdo para acesso público.info:eu-repo/semantics/openAccesseng2019-06-07T18:00:11Zoai:teses.usp.br:tde-19122018-211701Biblioteca Digital de Teses e Dissertaçõeshttp://www.teses.usp.br/PUBhttp://www.teses.usp.br/cgi-bin/mtd2br.plvirginia@if.usp.br|| atendimento@aguia.usp.br||virginia@if.usp.bropendoar:27212019-06-07T18:00:11Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP)false
dc.title.none.fl_str_mv Human-help in automated planning under uncertainty
Ajuda humana em planejamento automatizado sob incerteza
title Human-help in automated planning under uncertainty
spellingShingle Human-help in automated planning under uncertainty
Franch, Ignasi Andrés
Artificial intelligence
Automated planning
Colaboração humano robot
Human-robot collaboration
Inteligência artificial
Planejamento automatizado
Planejamento probabilístico
Planejamento sob incerteza
Planning under uncertainty
Probabilistic planning
title_short Human-help in automated planning under uncertainty
title_full Human-help in automated planning under uncertainty
title_fullStr Human-help in automated planning under uncertainty
title_full_unstemmed Human-help in automated planning under uncertainty
title_sort Human-help in automated planning under uncertainty
author Franch, Ignasi Andrés
author_facet Franch, Ignasi Andrés
author_role author
dc.contributor.none.fl_str_mv Barros, Leliane Nunes de
dc.contributor.author.fl_str_mv Franch, Ignasi Andrés
dc.subject.por.fl_str_mv Artificial intelligence
Automated planning
Colaboração humano robot
Human-robot collaboration
Inteligência artificial
Planejamento automatizado
Planejamento probabilístico
Planejamento sob incerteza
Planning under uncertainty
Probabilistic planning
topic Artificial intelligence
Automated planning
Colaboração humano robot
Human-robot collaboration
Inteligência artificial
Planejamento automatizado
Planejamento probabilístico
Planejamento sob incerteza
Planning under uncertainty
Probabilistic planning
description Planning is the sub-area of artificial intelligence that studies the process of selecting actions to lead an agent, e.g. a robot or a softbot, to a goal state. In many realistic scenarios, any choice of actions can lead the robot into a dead-end state, that is, a state from which the goal cannot be reached. In such cases, the robot can, pro-actively, resort to human help in order to reach the goal, an approach called symbiotic autonomy. In this work, we propose two different approaches to tackle this problem: (I) contingent planning, where the initial state is partially observable, configuring a belief state, and the outcomes of the robot actions are non-deterministic; and (II) probabilistic planning, where the initial state may be partially or totally observable and the actions have probabilistic outcomes. In both approaches, the human help is considered a scarce resource that should be used only when necessary. In contingent planning, the problem is to find a policy (a function mapping belief states into actions) that: (i) guarantees the agent will always reach the goal (strong policy); (ii) guarantees that the agent will eventually reach the goal (strong cyclic policy), or (iii) does not guarantee achieving the goal (weak policy). In this scenario, we propose a contingent planning system that considers human help to transform weak policies into strong (cyclic) policies. To do so, two types of human help are included: (i) human actions that modify states and/or belief states; and (ii) human observations that modify belief states. In probabilistic planning, the problem is to find a policy (a function mapping between world states and actions) that can be one of these two types: a proper policy, where the agent has probability 1 of reaching the goal; or an improper policy, in the case of unavoidable dead-ends. In general, the goal of the agent is to find a policy that minimizes the expected accumulated cost of the actions while maximizes the probability of reaching the goal. In this scenario, this work proposes probabilistic planners that consider human help to transform improper policies into proper policies however, considering two new (alternative) criteria: either to minimize the probability of using human actions or to minimize the expected number of human actions. Furthermore, we show that optimal policies under these criteria can be efficiently computed either by increasing human action costs or given a penalty when a human help is used. Solutions proposed in both scenarios, contingent planning and probabilistic planning with human help, were evaluated over a collection of planning problems with dead-ends. The results show that: (i) all generated policies (strong (cyclic) or proper) include human help only when necessary; and (ii) we were able to find policies for contingent planning problems with up to 10^15000 belief states and for probabilistic planning problems with more than 3*10^18 physical states.
publishDate 2018
dc.date.none.fl_str_mv 2018-09-21
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://www.teses.usp.br/teses/disponiveis/45/45134/tde-19122018-211701/
url http://www.teses.usp.br/teses/disponiveis/45/45134/tde-19122018-211701/
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv
dc.rights.driver.fl_str_mv Liberar o conteúdo para acesso público.
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Liberar o conteúdo para acesso público.
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.coverage.none.fl_str_mv
dc.publisher.none.fl_str_mv Biblioteca Digitais de Teses e Dissertações da USP
publisher.none.fl_str_mv Biblioteca Digitais de Teses e Dissertações da USP
dc.source.none.fl_str_mv
reponame:Biblioteca Digital de Teses e Dissertações da USP
instname:Universidade de São Paulo (USP)
instacron:USP
instname_str Universidade de São Paulo (USP)
instacron_str USP
institution USP
reponame_str Biblioteca Digital de Teses e Dissertações da USP
collection Biblioteca Digital de Teses e Dissertações da USP
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP)
repository.mail.fl_str_mv virginia@if.usp.br|| atendimento@aguia.usp.br||virginia@if.usp.br
_version_ 1815257140273610752