Jun 28, 2021 · We analyze the problem of estimating optimal Q-value functions for a discounted Markov decision process with discrete states and actions and ...
Jun 29, 2021 · We analyze the problem of estimating optimal $Q$-value functions for a discounted Markov decision process with discrete states and actions and ...
Jun 28, 2021 · The theory provides a precise way of distinguishing "easy" problems from "hard" ones in the context of $Q$-learning, as illustrated by an ...
... optimality in optimal value estimation: Adaptivity via variance-reduced Q-learning ... Instance-dependent confidence and early stopping for reinforcement learning.
Apr 9, 2024 · Instance-optimality in optimal value estimation: Adaptivity via variance-reduced Q-learning · IEEE Transactions on Information Theory (IF 2.5) ...
Apr 2, 2024 · This paper makes progress toward learning Nash equilibria in two-player, zero-sum Markov games from offline data.
We propose and analyze a reinforcement learning principle that approximates the Bellman equations by enforcing their validity only along a user-defined space ...
Apr 21, 2023 · This paper investigates a model-free algorithm of broad interest in reinforcement learning, namely, Q-learning. Whereas substantial progress ...
Nov 9, 2021 · Summary: The paper proposes and analyzes the adaptive pessimistic value iteration algorithm for offline reinforcement learning. The authors ...