21,454 Hits in 5.3 sec

Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems

Tommi S. Jaakkola, Satinder P. Singh, Michael I. Jordan
1994 Neural Information Processing Systems  
We propose and analyze a new learning algorithm to solve a certain class of non-Markov decision problems.  ...  Increasing attention has been paid to reinforcement learning algorithms in recent years, partly due to successes in the theoretical analysis of their behavior in Markov environments.  ...  Acknowledgments The authors thank Rich Sutton for pointing out errors at early stages of this work.  ... 
dblp:conf/nips/JaakkolaSJ94 fatcat:nvsm7pcrdfcuxevwx76uka2fjq

Partially Observed Markov Decision Processes: From Filtering to Controlled Sensing [Bookshelf]

Bo Wahlberg
2019 IEEE Control Systems  
Partially observed Markov decision processes (POMDPs) are a significant paradigm in real-world sequential decision making.  ...  Partially Observed Markov Decision Processes: From Filtering To Controlled Sensing is a valuable contribution to the literature on stochastic decision-making/control and stochastic optimization, and I  ... 
doi:10.1109/mcs.2019.2913493 fatcat:2eozixvhbjhzbby27rxrgqh4oa

Targets in Reinforcement Learning to solve Stackelberg Security Games [article]

Saptarashmi Bandyopadhyay, Chenqi Zhu, Philip Daniel, Joshua Morrison, Ethan Shay, John Dickerson
2022 arXiv   pre-print
Reinforcement Learning (RL) algorithms have been successfully applied to real world situations like illegal smuggling, poaching, deforestation, climate change, airport security, etc.  ...  This review investigates modeling of SSGs in RL with a focus on possible improvements of target representations in RL algorithms.  ...  These algorithms will need to be extended to multiagent RL for understanding optimal attacker policies as well. Partially Observable Markov Decision Process Partially  ... 
arXiv:2211.17132v1 fatcat:27spdjye5zainlmvb3pwy6riqy

The National Science Foundation Workshop on Reinforcement Learning

Sridhar Mahadevan, Leslie Pack Kaelbling
1996 The AI Magazine  
, partially observable decision problems (Russell and Parr 1995).  ...  Markov Decision Processes and Dynamic Programming A key assumption underlying much research in reinforcement learning is that the agent-environment interaction can be viewed as a Markov decision process  ... 
doi:10.1609/aimag.v17i4.1244 dblp:journals/aim/MahadevanK96 fatcat:vrz3h6o2cnb6njmtnohflsfksa

Efficient Identification of State in Reinforcement Learning

Stephan Timmer, Martin A. Riedmiller
2009 Künstliche Intelligenz  
A very general framework for modeling uncertainty in learning environments is given by partially observable Markov Decision Processes (POMDPs).  ...  In this article, we will present a reinforcement learning algorithm for solving deterministic POMDPs based on short-term memory.  ...  Deterministic Partially Observable Markov Decision Process (POMDP) A deterministic partially observable Markov Decision Process M := (T, S, A, O, f_S, f_O, In general, the transition function f_S and  ... 
dblp:journals/ki/TimmerR09 fatcat:mtledogt5jcqjnlpczhya4x6di

Inducing Partially Observable Markov Decision Processes

Michael L. Littman
2012 Journal of machine learning research  
Two different kinds of environment models dominate the literature: Markov Decision Processes (Puterman, 1994; Littman et al., 1995), or MDPs, and POMDPs, their Partially Observable counterpart (White  ...  The learning problem is not as well studied, but algorithms for learning to approximately optimize an MDP with a polynomial amount of experience have been created (Kearns and Singh, 2002; Strehl et al  ... 
dblp:journals/jmlr/Littman12 fatcat:6atocal7xfbunh6sd5xyahicpi

Hierarchically Structured Scheduling and Execution of Tasks in a Multi-Agent Environment [article]

Diogo S. Carvalho, Biswa Sengupta
2022 arXiv   pre-print
Reinforcement learning, however, is suited to deal with issues requiring making sequential decisions towards a long-term, often remote, goal.  ...  We propose to use deep reinforcement learning to solve both the high-level scheduling problem and the low-level multi-agent problem of schedule execution.  ...  Markov Decision Problems Reinforcement Learning algorithms propose to solve sequential decision-making problems formally described as Markov Decision Problems (Puterman, 2014).  ... 
arXiv:2203.03021v1 fatcat:undob22rivgbncl2bxddshbhja

Page 5650 of Mathematical Reviews Vol. , Issue 2003g [page]

2003 Mathematical Reviews  
In this paper we introduce GPOMDP, a simulation-based algorithm for generating a biased estimate of the gradient of the average reward in partially observable Markov decision processes (POMDPs) controlled  ...  Summary: “In this paper, we present algorithms that perform gradient ascent of the average reward in a partially observable Markov decision process (POMDP).  ... 

Shaping multi-agent systems with gradient reinforcement learning

Olivier Buffet, Alain Dutech, François Charpillet
2007 Autonomous Agents and Multi-Agent Systems  
An original Reinforcement Learning (RL) methodology is proposed for the design of multi-agent systems.  ...  But to cope with the difficulties inherent to RL used in that framework, we have developed an incremental learning algorithm where agents face a sequence of progressively more complex tasks.  ...  Reinforcement Learning Markov Decision Processes We first consider the ideal theoretic framework for Reinforcement Learning, that is to say Markov Decision Processes.  ... 
doi:10.1007/s10458-006-9010-5 fatcat:7pdttndjyzhtrelvedbrh5y4o4

An Extension of Profit Sharing to Partially Observable Markov Decision Processes: Proposition of PS-r* and its Evaluation

Kazuteru Miyazaki, Shigenobu Kobayashi
2003 Transactions of the Japanese society for artificial intelligence  
Observable Markov Decision Processes (POMDPs).  ...  We know the rationality theorem of Profit Sharing (PS) [Miyazaki 94, Miyazaki 99b] and the Rational Policy Making algorithm (RPM) [Miyazaki 99a] to guarantee the rationality in a typical class of Partially  ...  Jordan: Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems, Advances in Neural InformationVol. 11, No. 5, pp. 761-768 (1996) [Konda 00] V. R. Konda and J. N.  ... 
doi:10.1527/tjsai.18.286 fatcat:zjwvgcgjbfbbzmaha26u32ry7q

Model-Free Recurrent Reinforcement Learning for AUV Horizontal Control

Yujia Huo, Yiping Li, Xisheng Feng
2018 IOP Conference Series: Materials Science and Engineering  
These control problems are considered as a POMDP (Partially Observable Markov Decision Process).  ...  In this paper, aiming at the problems of 2-DOF horizontal motion control with high precision for autonomous underwater vehicle (AUV) trajectory tracking tasks, deep reinforcement learning controllers are  ...  Systems with these problems are described as Partially-Observable Markov Decision Processes (POMDP).  ... 
doi:10.1088/1757-899x/428/1/012063 fatcat:v7vacmagabagncx5v7bre5gzmq

Learning agents for uncertain environments (extended abstract)

Stuart Russell
1998 Proceedings of the eleventh annual conference on Computational learning theory - COLT' 98  
This talk proposes a very simple "baseline architecture" for a learning agent that can handle stochastic, partially observable environments.  ...  This seems to be a very interesting problem for the COLT, UAI, and ML communities, and has been addressed in econometrics under the heading of structural estimation of Markov decision processes.  ...  Reinforcement learning (RL) methods are essentially online algorithms for solving Markov decision processes (MDPs).  ... 
doi:10.1145/279943.279964 dblp:conf/colt/Russell98 fatcat:bnv35r7dzfdzfdhch3qea6amdy

A Deep Hierarchical Reinforcement Learning Algorithm in Partially Observable Markov Decision Processes

Tuyen P. Le, Ngo Anh Vien, Md. Abu Layek, TaeChoong Chung
2018 IEEE Access  
The problems of RL in such settings can be formulated as a partially observable Markov decision process (POMDP).  ...  We propose a hierarchical deep reinforcement learning approach for learning in hierarchical POMDP. The deep hierarchical RL algorithm is proposed to apply to both MDP and POMDP learning.  ...  Markov decision process for RL, and deep reinforcement learning.  ... 
doi:10.1109/access.2018.2854283 fatcat:rffxlckxcjg2fkucnqy53b64x4

Learning Factored Representations for Partially Observable Markov Decision Processes

Brian Sallans
1999 Neural Information Processing Systems  
The problem of reinforcement learning in a non-Markov environment is explored using a dynamic Bayesian network, where conditional independence assumptions between random variables are compactly represented  ...  The parameters are learned on-line, and approximations are used to perform inference and to compute the optimal value function.  ...  Acknowledgments We thank Geoffrey Hinton, Zoubin Ghahramani and Andy Brown for helpful discussions, the anonymous referees for valuable comments and criticism, and particularly Peter Dayan for helpful  ... 
dblp:conf/nips/Sallans99 fatcat:r4y7utdaqzfavinbo4xokiji4a

Reinforcement Learning with Hierarchies of Machines

Ronald Parr, Stuart J. Russell
1997 Neural Information Processing Systems  
We present provably convergent algorithms for problem-solving and learning with hierarchical machines and demonstrate their effectiveness on a problem with several thousand states.  ...  We present a new approach to reinforcement learning in which the policies considered by the learning process are constrained by hierarchies of partially specified machines.  ...  We expect that successful pursuit of these lines of research will provide a formal basis for understanding and unifying several seemingly disparate approaches to control, including behavior-based methods  ... 
dblp:conf/nips/ParrR97 fatcat:i5btazdpgjctdi4zg3prylaeou