
Optimization on a Budget: A Reinforcement Learning Approach

Paul Ruvolo, Ian R. Fasel, Javier R. Movellan
2008 Neural Information Processing Systems  
Reinforcement learning (RL) is a machine learning approach to learn optimal controllers from examples and thus is an obvious candidate to improve the heuristic-based controllers implicit in the most popular  ...  Here we show that a popular modern reinforcement learning technique using a very simple state space can dramatically improve the performance of general purpose optimizers, like the LMA.  ...  Conclusion We have presented a novel approach to the problem of learning optimization procedures for optimization on a fixed budget.  ... 
dblp:conf/nips/RuvoloFM08 fatcat:kz6kjqd4ubgnpabwsukil43bja

On Jointly Optimizing Partial Offloading and SFC Mapping: A Cooperative Dual-agent Deep Reinforcement Learning Approach [article]

Xinhan Wang, Huanlai Xing, Fuhong Song, Shouxi Luo, Penglin Dai, Bowen Zhao
2022 arXiv   pre-print
To address this, we propose a cooperative dual-agent deep reinforcement learning (CDADRL) algorithm, where we design a framework enabling interaction between two agents.  ...  ., a set of ordered virtual network functions (VNFs), can be mapped on MEC servers.  ...  Deep reinforcement learning (DRL) algorithms, which combine reinforcement learning (RL) with deep neural network (DNN), appear as visible approaches to NP-hard like computation offloading [5] and SFC  ... 
arXiv:2205.09925v1 fatcat:vdq3fjy4gnhnpk3h2konfdfpzy

Active Screening for Recurrent Diseases: A Reinforcement Learning Approach [article]

Han-Ching Ou, Haipeng Chen, Shahin Jabbari, Milind Tambe
2021 arXiv   pre-print
In this paper, we propose a novel reinforcement learning (RL) approach based on Deep Q-Networks (DQN), with several innovative adaptations that are designed to address the above challenges.  ...  In this approach, health workers periodically select a subset of population for screening.  ...  ACKNOWLEDGMENTS Chen and Jabbari were supported by the Center for Research on Computation and Society. This work was supported by the Army Research Office (MURI W911NF1810208).  ... 
arXiv:2101.02766v3 fatcat:m6cmfu2mqnfqtd22sucl6rn45e

Deep Policies for Online Bipartite Matching: A Reinforcement Learning Approach [article]

Mohammad Ali Alomrani, Reza Moravej, Elias B. Khalil
2022 arXiv   pre-print
We present an end-to-end Reinforcement Learning framework for deriving better matching policies based on trial-and-error on historical data.  ...  We show that most of the learning approaches perform consistently better than classical baseline algorithms on four synthetic and real-world datasets.  ...  In this work, we formulate online matching as a Markov Decision Process (MDP) for which a neural network is trained using Reinforcement Learning (RL) on past graph instances to make near-optimal matchings  ... 
arXiv:2109.10380v3 fatcat:mkbikoyyabhbzn4weq6onulvvu

A Search-Based Testing Approach for Deep Reinforcement Learning Agents [article]

Amirhossein Zolfagharian, Manel Abdellatif, Lionel Briand, Mojtaba Bagherzadeh, Ramesh S
2023 arXiv   pre-print
a limited testing budget.  ...  In this paper, we propose a Search-based Testing Approach of Reinforcement Learning Agents (STARLA) to test the policy of a DRL agent by effectively searching for failing executions of the agent within  ...  ACKNOWLEDGEMENTS This work was supported by a research grant from General Motors as well as the Canada Research Chair and Discovery Grant programs of the Natural Sciences and Engineering Research Council  ... 
arXiv:2206.07813v3 fatcat:gshj4rhx5ndyzfcjsa4azfajgq

Learning how to Active Learn: A Deep Reinforcement Learning Approach

Meng Fang, Yuan Li, Trevor Cohn
2017 Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing  
To address these shortcomings, we introduce a novel formulation by reframing the active learning as a reinforcement learning problem and explicitly learning a data selection policy, where the policy takes  ...  Active learning aims to select a small subset of data for annotation such that a classifier learned on the data is highly accurate.  ...  We adopt a reinforcement learning (RL) approach to learn a policy resulting in a highly accurate model.  ... 
doi:10.18653/v1/d17-1063 dblp:conf/emnlp/FangLC17 fatcat:y4j6fuq7jjbkfhakzxo3etpuz4

Learning how to Active Learn: A Deep Reinforcement Learning Approach [article]

Meng Fang, Yuan Li, Trevor Cohn
2017 arXiv   pre-print
To address these shortcomings, we introduce a novel formulation by reframing the active learning as a reinforcement learning problem and explicitly learning a data selection policy, where the policy takes  ...  Active learning aims to select a small subset of data for annotation such that a classifier learned on the data is highly accurate.  ...  We adopt a reinforcement learning (RL) approach to learn a policy resulting in a highly accurate model.  ... 
arXiv:1708.02383v1 fatcat:qzbff36oabf3rp7ny5cxkvszw4

A reinforcement learning approach to resource allocation in genomic selection [article]

Saba Moeinizade, Guiping Hu, Lizhi Wang
2021 arXiv   pre-print
Inspired by recent advances in reinforcement learning for AI problems, we develop a reinforcement learning-based algorithm to automatically learn to allocate limited resources across different generations  ...  Finally, we propose a value function approximation method to estimate the action-value function and then develop a greedy policy improvement technique to find the optimal resources.  ...  The proposed new method integrates the LAS approach in a reinforcement learning framework.  ... 
arXiv:2107.10901v1 fatcat:zhn2q6zmzzetfkfq2kfcfcposu

A Model-based Approach for Sample-efficient Multi-task Reinforcement Learning [article]

Nicholas C. Landolfi, Garrett Thomas, Tengyu Ma
2019 arXiv   pre-print
We evaluate our approach on several continuous control benchmarks and demonstrate its efficacy over MAML, a state-of-the-art meta-learning algorithm, on these tasks.  ...  The aim of multi-task reinforcement learning is two-fold: (1) efficiently learn by training against multiple tasks and (2) quickly adapt, using limited samples, to a variety of new tasks.  ...  Stochastic Lower Bound Optimization The recently proposed Stochastic Lower Bound Optimization (SLBO) method is a model-based reinforcement learning algorithm [Luo et al., 2019].  ... 
arXiv:1907.04964v3 fatcat:gxhzuh5oozdldfrumarjmzcgf4

Age-Based Scheduling for Mobile Edge Computing: A Deep Reinforcement Learning Approach [article]

Xingqiu He, Chaoqun You, Tony Q. S. Quek
2024 arXiv   pre-print
Notably, the problem can be interpreted as a Markov Decision Process (MDP), thus enabling its solution through Reinforcement Learning (RL) algorithms.  ...  To better serve these applications, we propose a new definition of AoI and, based on the redefined AoI, we formulate an online AoI minimization problem for MEC systems.  ...  To this end, we adopt a model-free reinforcement learning approach that learns $Q^{*,\lambda}$ and $\pi^{*,\lambda}$ online. Remark: At this stage, we can elaborate on why we need constraint (7).  ... 
arXiv:2312.00279v2 fatcat:wwia777zdvhqjat47cni2y6qua

Contingency-Aware Influence Maximization: A Reinforcement Learning Approach [article]

Haipeng Chen, Wei Qiu, Han-Ching Ou, Bo An, Milind Tambe
2021 arXiv   pre-print
Motivated by this and inspired by the line of works that use reinforcement learning (RL) to address combinatorial optimization on graphs, we formalize the problem as a Markov Decision Process (MDP), and  ...  In this study, we focus on a sub-class of IM problems, where whether the nodes are willing to be the seeds when being invited is uncertain, called contingency-aware IM.  ...  Chen was supported by the Center for Research on Computation and Society.  ... 
arXiv:2106.07039v1 fatcat:eqau22irhbaazevl5poeqv32wy

A Reinforcement Learning Approach to Sensing Design in Resource-Constrained Wireless Networked Control Systems [article]

Luca Ballotta, Giovanni Peserico, Francesco Zanini
2022 arXiv   pre-print
To tackle this design problem, we propose a Reinforcement Learning approach to learn an efficient policy that dynamically decides when measurements are to be processed at each sensor.  ...  Effectiveness of our proposed approach is validated through a numerical simulation with case study on smart sensing motivated by the Internet of Drones.  ...  To solve this problem, we propose a Reinforcement Learning (RL) approach, which is detailed in Section III.  ... 
arXiv:2204.00703v4 fatcat:obiln7xkffbvvncnup3ttgwzme

Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning Approach [article]

Xuezhou Zhang, Yuda Song, Masatoshi Uehara, Mengdi Wang, Alekh Agarwal, Wen Sun
2022 arXiv   pre-print
BRIEE interleaves latent states discovery, exploration, and exploitation together, and can provably learn a near-optimal policy with sample complexity scaling polynomially in the number of latent states  ...  We present BRIEE (Block-structured Representation learning with Interleaved Explore Exploit), an algorithm for efficient reinforcement learning in Markov Decision Processes with block-structured dynamics  ...  to learning an ε-optimal policy for Block MDPs on a given reward function.  ... 
arXiv:2202.00063v3 fatcat:qrq66bmbz5apvaqg3rpndfdjqa

A Deep Reinforcement Learning Approach for Online Parcel Assignment [article]

Hao Zeng, Qiong Wu, Kunpeng Han, Junying He, Haoyuan Hu
2023 arXiv   pre-print
To tackle this challenge, we propose the PPO-OPA algorithm based on deep reinforcement learning that shows competitive performance.  ...  The OPA problem is challenging due to its stochastic nature: each parcel's candidate routes, which depend on the parcel's origin, destination, weight, etc., are unknown until its order is placed, and  ...  To bridge between reinforcement learning and online learning, Even-Dar et al.  ... 
arXiv:2109.03467v2 fatcat:ueg7w5yukzethjq5agqiufd72m

Lazy OCO: Online Convex Optimization on a Switching Budget [article]

Uri Sherman, Tomer Koren
2023 arXiv   pre-print
We study a variant of online convex optimization where the player is permitted to switch decisions at most S times in expectation throughout T rounds.  ...  In addition, for stochastic i.i.d. losses, we present a simple algorithm that performs log T switches with only a multiplicative log T factor overhead in its regret in both the general and strongly convex  ...  Acknowledgements This work was partially supported by the Israeli Science Foundation (ISF) grant no. 2549/19, by the Len Blavatnik and the Blavatnik Family foundation, and by the Yandex Initiative in Machine Learning  ... 
arXiv:2102.03803v7 fatcat:yymsvgoynjd3liuiqxqikmywhq