Modeling Human Decision Making in Generalized Gaussian Multiarmed Bandits
2014
Proceedings of the IEEE
| In this paper, we present a formal model of human decision making in explore-exploit tasks using the context of multiarmed bandit problems, where the decision maker must choose among multiple options ...
We focus on the case of Gaussian rewards in a setting where the decision maker uses Bayesian inference to estimate the reward values. ...
... Cohen for their input, which helped make possible the strong connection of this work to the psychology literature. ...
doi:10.1109/jproc.2014.2307024
fatcat:6xwlrab5ynbu5ag7qnjj544ihq
Modeling Human Decision-making in Generalized Gaussian Multi-armed Bandits
[article]
2019
arXiv
pre-print
We present a formal model of human decision-making in explore-exploit tasks using the context of multi-armed bandit problems, where the decision-maker must choose among multiple options with uncertain ...
We focus on the case of Gaussian rewards in a setting where the decision-maker uses Bayesian inference to estimate the reward values. ...
... Cohen for their input, which helped make possible the strong connection of this work to the psychology literature. ...
arXiv:1307.6134v5
fatcat:gpucau2sxzb4dj2p3bh3jksloi
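The Bayesian machinery these two entries describe — a decision-maker maintaining Gaussian beliefs over arm means and acting on an optimistic index — can be sketched minimally. This is an illustrative sketch only, not the papers' UCL algorithm: the function names are hypothetical, and the fixed `z` level stands in for the time-varying credibility schedule the papers actually analyze.

```python
import math

def gaussian_posterior(mu0, var0, rewards, var_r):
    """Conjugate update of a Gaussian prior N(mu0, var0) on an arm's mean
    reward, given observed rewards with known noise variance var_r."""
    n = len(rewards)
    if n == 0:
        return mu0, var0
    precision = 1.0 / var0 + n / var_r
    var_n = 1.0 / precision
    mu_n = var_n * (mu0 / var0 + sum(rewards) / var_r)
    return mu_n, var_n

def upper_credible_limit(mu_n, var_n, z=1.96):
    """Optimistic index: posterior mean plus z posterior standard deviations."""
    return mu_n + z * math.sqrt(var_n)
```

At each round the decision-maker would compute this index for every arm and pull the argmax, trading off the posterior mean (exploitation) against posterior uncertainty (exploration).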
Addictive Games: Case Study on Multi-Armed Bandit Game
2021
Information
This article focuses on extending the motion-in-mind model to the setting of multi-armed bandit games, quantifying the player's psychological inclination from simulation data ...
Also, the multi-armed bandit game is a typical Skinner-box design and is popular in gambling houses, which makes it a good example to analyze. ...
The bandit in this simulation is Gaussian: every arm's reward follows a Gaussian distribution. ...
doi:10.3390/info12120521
fatcat:hth47aalinhdjoqs5rpf3hf5yy
Algorithmic models of human decision making in Gaussian multi-armed bandit problems
2014
2014 European Control Conference (ECC)
We consider a heuristic Bayesian algorithm as a model of human decision making in multi-armed bandit problems with Gaussian rewards. ...
The stochastic algorithm encodes many of the observed features of human decision making. ...
Application to human decision making: Human decision making in multi-armed bandit problems is well modeled by a heuristic similar to that of UCL (11), and humans are sensitive to the parameters of the ...
doi:10.1109/ecc.2014.6862580
dblp:conf/eucc/ReverdySL14
fatcat:meuumb5h7zeenjo5gmegz5iwzi
On optimal foraging and multi-armed bandits
2013
2013 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton)
We observe that the multi-armed bandit problem with transition costs and the associated block allocation algorithm capture the key features of popular animal foraging models in the literature. ...
We consider two variants of the standard multiarmed bandit problem, namely, the multi-armed bandit problem with transition costs and the multi-armed bandit problem on graphs. ...
[6] established the optimality of a Bayesian UCB algorithm for Gaussian rewards and drew several connections between these algorithms and human decision-making. ...
doi:10.1109/allerton.2013.6736565
dblp:conf/allerton/SrivastavaRL13
fatcat:k7wo7zzmjrfpfdclg53zzpqadi
Satisficing in multi-armed bandit problems
[article]
2016
arXiv
pre-print
Satisficing is a relaxation of maximizing and allows for less risky decision making in the face of uncertainty. ...
We propose two sets of satisficing objectives for the multi-armed bandit problem, where the objective is to achieve reward-based decision-making performance above a given threshold. ...
Finally, it is well understood that satisficing is an important feature of human decision making [29] and that the UCL algorithm can model many features of human decision making in bandit tasks ...
arXiv:1512.07638v2
fatcat:ourj3y6dpjgm7lepgvkvhk4epi
Satisficing in Multi-Armed Bandit Problems
2017
IEEE Transactions on Automatic Control
Satisficing is a relaxation of maximizing and allows for less risky decision making in the face of uncertainty. ...
We propose two sets of satisficing objectives for the multi-armed bandit problem, where the objective is to achieve reward-based decision-making performance above a given threshold. ...
Finally, it is well understood that satisficing is an important feature of human decision making [29] and that the UCL algorithm can model many features of human decision making in bandit tasks ...
doi:10.1109/tac.2016.2644380
fatcat:d4xp4cajp5d2fpjtngw6tjvn4u
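The satisficing idea in these two entries — staying with any option whose estimated reward clears a threshold rather than always maximizing — can be illustrated with a hypothetical choice rule. This is a rough sketch of the concept, not either paper's formal satisficing objectives:

```python
def satisficing_choice(means, threshold, current):
    """Satisficing rule: if the current arm's estimated mean reward clears
    the threshold, stay with it; otherwise fall back to maximizing over
    the estimated means."""
    if means[current] >= threshold:
        return current
    return max(range(len(means)), key=lambda i: means[i])
```

Because a satisfactory arm is never abandoned, the rule explores (and risks) less than a maximizing policy whenever the threshold is attainable.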
Robot fast adaptation to changes in human engagement during simulated dynamic social interaction with active exploration in parameterized reinforcement learning
2018
IEEE Transactions on Cognitive and Developmental Systems
... action in human-robot interaction scenarios, mainly at the lower level of a multiarmed bandit framework. ...
... on a table), hence in essence similar to the nonstationary multiarmed bandit paradigm. ...
doi:10.1109/tcds.2018.2843122
fatcat:5c64vbuft5dklhotrt2kkg5w7q
The Knowledge Gradient Algorithm for a General Class of Online Learning Problems
2012
Operations Research
Our approach is able to handle the case where our prior beliefs about the rewards are correlated, which is not handled by traditional multiarmed bandit methods. ...
Experiments show that our KG policy performs competitively against the best-known approximation to the optimal policy in the classic bandit problem, and it outperforms many learning policies in the correlated ...
This research was supported in part by AFOSR contract FA9550-08-1-0195 and ONR contract N00014-07-1-0150 through the Center for Dynamic Data Analysis. ...
doi:10.1287/opre.1110.0999
fatcat:c54svouocbhchhrsuml4xlrmeq
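For readers unfamiliar with the knowledge-gradient idea in this entry, the standard KG index for *independent* Gaussian beliefs — the expected one-step improvement in the best posterior mean from measuring arm i — can be sketched as below. This is the generic textbook formula under an assumed independent prior, not the paper's correlated-beliefs KG policy, and the function names are hypothetical:

```python
import math

def _f(z):
    """Standard-normal KG function f(z) = z * Phi(z) + phi(z)."""
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    Phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return z * Phi + phi

def kg_index(mu, var, var_r, i):
    """Knowledge-gradient index of arm i given posterior means mu,
    posterior variances var, and measurement noise variance var_r."""
    # Variance of the belief after one more (hypothetical) measurement.
    var_next = 1.0 / (1.0 / var[i] + 1.0 / var_r)
    sigma_tilde = math.sqrt(var[i] - var_next)  # predictive change in the mean
    best_other = max(mu[j] for j in range(len(mu)) if j != i)
    z = -abs(mu[i] - best_other) / sigma_tilde
    return sigma_tilde * _f(z)
```

In the online setting the paper addresses, the KG policy would combine this index with the immediate posterior mean (roughly, pull the arm maximizing `mu[i]` plus a horizon-weighted `kg_index`), but the exact weighting is the paper's contribution and is not reproduced here.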
Monte Carlo Search Algorithm Discovery for Single-Player Games
2013
IEEE Transactions on Computational Intelligence and AI in Games
We rely on multiarmed bandits to approximately solve this optimization problem. ...
We also show that the discovered algorithms are generally quite robust with respect to changes in the distribution over the training problems. ...
He is currently an Associate Professor at the University of Liège, where he is affiliated with the Systems and Modeling Research Unit. He is also the holder of the EDF-Luminus Chair on Smart Grids. ...
doi:10.1109/tciaig.2013.2239295
fatcat:hucv2zgzfneyhjfh72ggsl545e
Putting bandits into context: How function learning supports decision making
2018
Journal of Experimental Psychology. Learning, Memory and Cognition
We introduce the contextual multi-armed bandit task as a framework to investigate learning and decision making in uncertain environments. ...
Participants are mostly able to learn about the context-reward functions and their behaviour is best described by a Gaussian process learning strategy which generalizes previous experience to similar instances ...
Incorporating context into models of reinforcement learning and decision making generally provides a fruitful avenue for future research. ...
doi:10.1037/xlm0000463
pmid:29130693
fatcat:euxutvfh7rbfpkfull3hdns7vq
Putting bandits into context: How function learning supports decision making
[article]
2016
bioRxiv
pre-print
We introduce the contextual multi-armed bandit task as a framework to investigate learning and decision making in uncertain environments. ...
We model participants' behaviour by context-blind (mean-tracking, Kalman filter) and contextual (Gaussian process regression parametrized with different kernels) learning approaches combined with different ...
Incorporating context into models of reinforcement learning and decision making generally provides a fruitful avenue for future research. ...
doi:10.1101/081091
fatcat:3jpqpzybhrhlthoeqvsdmseine
Aversion to Option Loss in a Restless Bandit Task
2018
Computational Brain & Behavior
A Kalman filter model using Thompson sampling provides an excellent account of human learning in a standard restless bandit task, but there are systematic departures in the vanishing bandit task. ...
Inspired by work in the judgment and decision-making literature, we present two experiments using multi-armed bandit tasks in both static and dynamic environments, in situations where options can become ...
For simpler versions of the multiarmed bandit problem, there are closed-form solutions for optimal decisions (Whittle 1980), but in general this is not the case (see Burtini et al. 2015). ...
doi:10.1007/s42113-018-0010-8
fatcat:qzvqyeid7zgh7pqzj5fsinwze4
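A minimal sketch of the kind of learning model this entry evaluates — per-arm Kalman filters whose beliefs diffuse between trials, combined with Thompson sampling — might look like the following. The diffusion and noise variances are illustrative assumptions, and the function is a hypothetical simplification, not the fitted model from the paper:

```python
import math
import random

def kalman_thompson_step(mu, var, drift_var, noise_var, pull_reward):
    """One round of Thompson sampling with per-arm Kalman filters in a
    restless bandit. Mutates and returns the belief lists mu, var."""
    # Diffusion: every arm's latent mean may have drifted, so variances grow.
    for i in range(len(var)):
        var[i] += drift_var
    # Thompson sampling: draw one sample per arm, pull the argmax.
    samples = [random.gauss(m, math.sqrt(v)) for m, v in zip(mu, var)]
    arm = max(range(len(mu)), key=lambda i: samples[i])
    # Kalman update of the pulled arm only.
    r = pull_reward(arm)
    gain = var[arm] / (var[arm] + noise_var)
    mu[arm] += gain * (r - mu[arm])
    var[arm] *= (1.0 - gain)
    return arm, mu, var
```

The diffusion term is what makes the bandit "restless": unobserved arms grow more uncertain over time, so the sampler keeps revisiting them — the baseline behaviour against which the paper's vanishing-option departures are measured.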
Correlated Multiarmed Bandit Problem: Bayesian Algorithms and Regret Analysis
[article]
2015
arXiv
pre-print
We consider the correlated multiarmed bandit (MAB) problem in which the rewards associated with each arm are modeled by a multivariate Gaussian random variable, and we investigate the influence of the ...
We rigorously characterize the influence of accuracy, confidence, and correlation scale in the prior on the decision-making performance of the algorithms. ...
It is also shown that a variation of the UCL algorithm models human decision-making in an MAB task. ...
arXiv:1507.01160v2
fatcat:xaepyacedngjtbhf45s7zbjutm
On Distributed Cooperative Decision-Making in Multiarmed Bandits
[article]
2019
arXiv
pre-print
We study the explore-exploit tradeoff in distributed cooperative decision-making using the context of the multiarmed bandit (MAB) problem. ...
We rigorously analyze the performance of the cooperative UCB algorithm and characterize the influence of communication graph structure on the decision-making performance of the group. ...
Running consensus and related models have been used to study learning [12] and decision-making [23] in social networks. ...
arXiv:1512.06888v3
fatcat:baf6suveurf4zanewhjkxh7fuy
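The "running consensus" mentioned in the last snippet — agents repeatedly averaging their neighbours' reward estimates through a weight matrix — reduces to a one-line update. The row-stochastic `W` below is a stand-in assumption, not the paper's communication model:

```python
def consensus_step(estimates, W):
    """One running-consensus update: agent i replaces its estimate with a
    weighted average of all agents' estimates, weights from row i of the
    row-stochastic matrix W (zero weight = no communication link)."""
    n = len(estimates)
    return [sum(W[i][j] * estimates[j] for j in range(n)) for i in range(n)]
```

Iterating this step drives all agents toward a common estimate, and how fast it does so depends on the spectral properties of `W` — the graph-structure dependence the entry's regret analysis characterizes.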
Showing results 1 — 15 out of 242 results