Instance-optimality in optimal value estimation: Adaptivity via variance-reduced Q-learning.

Scholarly articles for Instance-optimality in optimal value estimation: Adaptivity via variance-reduced Q-learning.

scholar.google.com › citations

… estimation: Adaptivity via variance-reduced Q-learning
Khamaru · Cited by 21

Instance-optimality in optimal value estimation: Adaptivity via variance ...

Jun 28, 2021 · We analyze the problem of estimating optimal Q-value functions for a discounted Markov decision process with discrete states and actions and ...

(PDF) Instance-optimality in optimal value estimation: Adaptivity via ...

www.researchgate.net › publication › 35...

Jun 29, 2021 · We analyze the problem of estimating optimal $Q$-value functions for a discounted Markov decision process with discrete states and actions and ...

Instance-optimality in optimal value estimation: Adaptivity via variance ...

www.semanticscholar.org › paper › Insta...

Jun 28, 2021 · The theory provides a precise way of distinguishing "easy" problems from "hard" ones in the context of $Q$-learning, as illustrated by an ...

[PDF] Adaptivity via variance-reduced Q-learning - arXiv

arxiv.org › pdf

Jun 28, 2021 · We analyze the problem of estimating optimal Q-value functions for a discounted Markov decision process with discrete states and actions and ...

Koulik Khamaru - Google Scholar

scholar.google.co.in › citations

... optimality in optimal value estimation: Adaptivity via variance-reduced Q-learning ... Instance-dependent confidence and early stopping for reinforcement learning.

Instance-optimality in optimal value estimation: Adaptivity via ... - X-MOL

www.x-mol.com › paper

Apr 9, 2024 · Instance-optimality in optimal value estimation: Adaptivity via variance-reduced Q-learning · IEEE Transactions on Information Theory (IF 2.5) ...

Model-Based Reinforcement Learning for Offline Zero-Sum Markov Games

pubsonline.informs.org › opre.2022.0342

Apr 2, 2024 · This paper makes progress toward learning Nash equilibria in two-player, zero-sum Markov games from offline data.

Martin J. Wainwright | Papers With Code

paperswithcode.com › author › martin-j-...

We propose and analyze a reinforcement learning principle that approximates the Bellman equations by enforcing their validity only along a user-defined space ...

Is Q-Learning Minimax Optimal? A Tight Sample Complexity Analysis

pubsonline.informs.org › opre.2023.2450

Apr 21, 2023 · This paper investigates a model-free algorithm of broad interest in reinforcement learning, namely, Q-learning. Whereas substantial progress ...

Towards Instance-Optimal Offline Reinforcement Learning with ...

openreview.net › forum

Nov 9, 2021 · Summary: The paper proposes and analyzes the adaptive pessimistic value iteration algorithm for offline reinforcement learning. The authors ...