Deceptive Kernel Function on Observations of Discrete POMDP
by
Zhili Zhang, Quanyan Zhu
2020
Abstract
This paper studies deception applied to an agent in a partially observable
Markov decision process (POMDP). We introduce a deceptive kernel function (the
kernel) applied to the agent's observations in a discrete POMDP. Considering
three characteristic algorithms the agent may use, namely value iteration,
value function approximation, and POMCP, we analyze how the agent's belief is
misled by the falsified observations that the kernel outputs, and we anticipate
the likely threat to the agent's reward and potentially other aspects of its
performance. We validate our expectations and explore further detrimental
effects of the deception by experimenting on two POMDP problems. The results
show that the kernel applied to the agent's observations can affect its belief
and substantially lower its resulting rewards; meanwhile, certain
implementations of the kernel can induce other abnormal behaviors in the agent.
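The mechanism described in the abstract can be illustrated with a minimal sketch. It assumes the kernel is a row-stochastic matrix `K` over observations, where `K[o, o']` is the probability that the true observation `o` is replaced by `o'`; this record does not give the paper's exact definition, so that form is an assumption. The two-state model, the matrices `T`, `O`, and `K_flip`, and the helper names are hypothetical and chosen only for illustration.

```python
import numpy as np

def belief_update(belief, T, O, obs):
    """One Bayes filter step for a fixed action: predict with T, correct with O."""
    predicted = belief @ T                  # predicted[s'] = sum_s belief[s] * T[s, s']
    unnormalized = predicted * O[:, obs]    # weight each state by P(obs | s')
    return unnormalized / unnormalized.sum()

def apply_kernel(K, obs, rng):
    """Sample the falsified observation from row `obs` of the kernel matrix K."""
    return rng.choice(K.shape[1], p=K[obs])

# Hypothetical model: 2 states, 2 observations, and an action that
# leaves the state unchanged (identity transition matrix).
T = np.eye(2)
O = np.array([[0.85, 0.15],
              [0.15, 0.85]])    # O[s', o] = P(o | s')

# Assumed kernel: always swaps the two observations.
K_flip = np.array([[0.0, 1.0],
                   [1.0, 0.0]])

rng = np.random.default_rng(0)
prior = np.array([0.5, 0.5])
true_obs = 0
fake_obs = apply_kernel(K_flip, true_obs, rng)

honest_belief = belief_update(prior, T, O, true_obs)     # agent sees the true observation
deceived_belief = belief_update(prior, T, O, fake_obs)   # agent sees the kernel's output
```

In this toy setting the honest update shifts the belief toward state 0, while the deceived update shifts it toward state 1 by the same amount, because the agent still trusts its original observation model `O`. A planner acting on the deceived belief would then choose actions suited to the wrong state, which is the route by which falsified observations can lower reward.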
Archived Files and Locations
application/pdf, 518.2 kB
arxiv.org (repository), web.archive.org (webarchive)
arXiv: 2008.05585v1