Optimization on a Budget: A Reinforcement Learning Approach.

Reinforcement learning (RL) is a machine learning approach to learn optimal controllers from examples and thus is an obvious candidate to improve the heuristic-based controllers implicit in the most popular ... Here we show that a popular modern reinforcement learning technique using a very simple state space can dramatically improve the performance of general purpose optimizers, like the LMA. ... Conclusion: We have presented a novel approach to the problem of learning optimization procedures for optimization on a fixed budget. ...

dblp:conf/nips/RuvoloFM08 fatcat:kz6kjqd4ubgnpabwsukil43bja

To address this, we propose a cooperative dual-agent deep reinforcement learning (CDADRL) algorithm, where we design a framework enabling interaction between two agents. ... a set of ordered virtual network functions (VNFs), can be mapped on MEC servers. ... Deep reinforcement learning (DRL) algorithms, which combine reinforcement learning (RL) with deep neural networks (DNNs), appear as viable approaches to NP-hard problems like computation offloading [5] and SFC ...

arXiv:2205.09925v1 fatcat:vdq3fjy4gnhnpk3h2konfdfpzy

In this paper, we propose a novel reinforcement learning (RL) approach based on Deep Q-Networks (DQN), with several innovative adaptations that are designed to address the above challenges. ... In this approach, health workers periodically select a subset of population for screening. ... Acknowledgments: Chen and Jabbari were supported by the Center for Research on Computation and Society. This work was supported by the Army Research Office (MURI W911NF1810208). ...

arXiv:2101.02766v3 fatcat:m6cmfu2mqnfqtd22sucl6rn45e

We present an end-to-end Reinforcement Learning framework for deriving better matching policies based on trial-and-error on historical data. ... We show that most of the learning approaches perform consistently better than classical baseline algorithms on four synthetic and real-world datasets. ... In this work, we formulate online matching as a Markov Decision Process (MDP) for which a neural network is trained using Reinforcement Learning (RL) on past graph instances to make near-optimal matchings ...

arXiv:2109.10380v3 fatcat:mkbikoyyabhbzn4weq6onulvvu

a limited testing budget. ... In this paper, we propose a Search-based Testing Approach of Reinforcement Learning Agents (STARLA) to test the policy of a DRL agent by effectively searching for failing executions of the agent within ... Acknowledgements: This work was supported by a research grant from General Motors as well as the Canada Research Chair and Discovery Grant programs of the Natural Sciences and Engineering Research Council ...

arXiv:2206.07813v3 fatcat:gshj4rhx5ndyzfcjsa4azfajgq

To address these shortcomings, we introduce a novel formulation by reframing active learning as a reinforcement learning problem and explicitly learning a data selection policy, where the policy takes ... Active learning aims to select a small subset of data for annotation such that a classifier learned on the data is highly accurate. ... We adopt a reinforcement learning (RL) approach to learn a policy resulting in a highly accurate model. ...

doi:10.18653/v1/d17-1063 dblp:conf/emnlp/FangLC17 fatcat:y4j6fuq7jjbkfhakzxo3etpuz4

To address these shortcomings, we introduce a novel formulation by reframing active learning as a reinforcement learning problem and explicitly learning a data selection policy, where the policy takes ... Active learning aims to select a small subset of data for annotation such that a classifier learned on the data is highly accurate. ... We adopt a reinforcement learning (RL) approach to learn a policy resulting in a highly accurate model. ...

arXiv:1708.02383v1 fatcat:qzbff36oabf3rp7ny5cxkvszw4

Inspired by recent advances in reinforcement learning for AI problems, we develop a reinforcement learning-based algorithm to automatically learn to allocate limited resources across different generations ... Finally, we propose a value function approximation method to estimate the action-value function and then develop a greedy policy improvement technique to find the optimal resources. ... The proposed new method integrates the LAS approach in a reinforcement learning framework. ...

arXiv:2107.10901v1 fatcat:zhn2q6zmzzetfkfq2kfcfcposu

We evaluate our approach on several continuous control benchmarks and demonstrate its efficacy over MAML, a state-of-the-art meta-learning algorithm, on these tasks. ... The aim of multi-task reinforcement learning is two-fold: (1) efficiently learn by training against multiple tasks and (2) quickly adapt, using limited samples, to a variety of new tasks. ... Stochastic Lower Bound Optimization: The recently proposed Stochastic Lower Bound Optimization (SLBO) method is a model-based reinforcement learning algorithm [Luo et al., 2019]. ...

arXiv:1907.04964v3 fatcat:gxhzuh5oozdldfrumarjmzcgf4

Notably, the problem can be interpreted as a Markov Decision Process (MDP), thus enabling its solution through Reinforcement Learning (RL) algorithms. ... To better serve these applications, we propose a new definition of AoI and, based on the redefined AoI, we formulate an online AoI minimization problem for MEC systems. ... To this end, we adopt a model-free reinforcement learning approach that learns Q*,λ and π*,λ online. Remark: At this stage, we can elaborate on why we need constraint (7). ...

arXiv:2312.00279v2 fatcat:wwia777zdvhqjat47cni2y6qua

Motivated by this and inspired by the line of works that use reinforcement learning (RL) to address combinatorial optimization on graphs, we formalize the problem as a Markov Decision Process (MDP), and ... In this study, we focus on a sub-class of IM problems, where whether the nodes are willing to be the seeds when being invited is uncertain, called contingency-aware IM. ... Chen was supported by the Center for Research on Computation and Society. ...

arXiv:2106.07039v1 fatcat:eqau22irhbaazevl5poeqv32wy

To tackle this design problem, we propose a Reinforcement Learning approach to learn an efficient policy that dynamically decides when measurements are to be processed at each sensor. ... Effectiveness of our proposed approach is validated through a numerical simulation with case study on smart sensing motivated by the Internet of Drones. ... To solve this problem, we propose a Reinforcement Learning (RL) approach, which is detailed in Section III. ...

arXiv:2204.00703v4 fatcat:obiln7xkffbvvncnup3ttgwzme

BRIEE interleaves latent states discovery, exploration, and exploitation together, and can provably learn a near-optimal policy with sample complexity scaling polynomially in the number of latent states ... We present BRIEE (Block-structured Representation learning with Interleaved Explore Exploit), an algorithm for efficient reinforcement learning in Markov Decision Processes with block-structured dynamics ... to learning an ε-optimal policy for Block MDPs on a given reward function. ...

arXiv:2202.00063v3 fatcat:qrq66bmbz5apvaqg3rpndfdjqa

To tackle this challenge, we propose the PPO-OPA algorithm based on deep reinforcement learning that shows competitive performance. ... The OPA problem is challenging due to its stochastic nature: each parcel's candidate routes, which depend on the parcel's origin, destination, weight, etc., are unknown until its order is placed, and ... To bridge between reinforcement learning and online learning, Even-Dar et al. ...

arXiv:2109.03467v2 fatcat:ueg7w5yukzethjq5agqiufd72m

We study a variant of online convex optimization where the player is permitted to switch decisions at most S times in expectation throughout T rounds. ... In addition, for stochastic i.i.d. losses, we present a simple algorithm that performs log T switches with only a multiplicative log T factor overhead in its regret in both the general and strongly convex ... Acknowledgements: This work was partially supported by the Israeli Science Foundation (ISF) grant no. 2549/19, by the Len Blavatnik and the Blavatnik Family foundation, and by the Yandex Initiative in Machine Learning ...

arXiv:2102.03803v7 fatcat:yymsvgoynjd3liuiqxqikmywhq

On Jointly Optimizing Partial Offloading and SFC Mapping: A Cooperative Dual-agent Deep Reinforcement Learning Approach [article]

Active Screening for Recurrent Diseases: A Reinforcement Learning Approach [article]

Deep Policies for Online Bipartite Matching: A Reinforcement Learning Approach [article]

A Search-Based Testing Approach for Deep Reinforcement Learning Agents [article]

Learning how to Active Learn: A Deep Reinforcement Learning Approach

Learning how to Active Learn: A Deep Reinforcement Learning Approach [article]

A reinforcement learning approach to resource allocation in genomic selection [article]

A Model-based Approach for Sample-efficient Multi-task Reinforcement Learning [article]

Age-Based Scheduling for Mobile Edge Computing: A Deep Reinforcement Learning Approach [article]

Contingency-Aware Influence Maximization: A Reinforcement Learning Approach [article]

A Reinforcement Learning Approach to Sensing Design in Resource-Constrained Wireless Networked Control Systems [article]

Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning Approach [article]

A Deep Reinforcement Learning Approach for Online Parcel Assignment [article]

Lazy OCO: Online Convex Optimization on a Switching Budget [article]
