A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2021; you can also visit the original URL.
The file type is application/pdf.
Data Pricing in Machine Learning Pipelines
[article]
2021
arXiv
pre-print
In this article, we survey the principles and the latest research development of data pricing in machine learning pipelines. We start with a brief review of data marketplaces and pricing desiderata. ...
As machine learning pipelines involve many parties and, in order to be successful, have to form a constructive and dynamic eco-system, marketplaces and data pricing are fundamental in connecting and facilitating ...
They develop a multi-armed bandit algorithm to extend the DG model [20], which dynamically adjusts the parameter β in Equation 6. A larger β encourages more accurate labels but costs more money. ...
arXiv:2108.07915v1
fatcat:736zip2pbndupl7hixdbfz33om
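The snippet above describes a bandit that dynamically tunes the pricing parameter β. As a rough illustration of the idea only (not the algorithm from the paper — the candidate grid, reward function, and ε-greedy strategy here are all hypothetical):

```python
import random

def tune_beta(candidate_betas, reward_fn, rounds=1000, eps=0.1, seed=0):
    """epsilon-greedy bandit over a discrete grid of beta values.

    Each arm is a candidate beta; reward_fn(beta) returns a noisy
    payoff (e.g. label accuracy minus payment cost)."""
    rng = random.Random(seed)
    counts = [0] * len(candidate_betas)
    means = [0.0] * len(candidate_betas)
    for _ in range(rounds):
        if rng.random() < eps:
            arm = rng.randrange(len(candidate_betas))                       # explore
        else:
            arm = max(range(len(candidate_betas)), key=lambda i: means[i])  # exploit
        r = reward_fn(candidate_betas[arm])
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]                        # incremental mean
    return candidate_betas[max(range(len(candidate_betas)), key=lambda i: means[i])]

# Toy reward: accuracy grows with beta, but cost grows quadratically.
noisy = random.Random(1)
reward = lambda b: min(1.0, b) - 0.8 * b * b + noisy.gauss(0, 0.05)
best_beta = tune_beta([0.1, 0.3, 0.5, 0.7, 0.9], reward)
```

The learner should concentrate on the β values near the optimum of the toy reward curve.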
Dynamic pricing and learning: Historical origins, current research, and new directions
2015
Surveys in Operations Research and Management Science
Dynamic pricing and learning is a research topic that has received a considerable amount of attention in recent years, from different scientific communities: operations research and management science, ...
pricing, and provide an in-depth overview of the available literature on dynamic pricing and learning. ...
This survey is a considerable extension of the literature review in chapter 2 of den Boer (2013). ...
doi:10.1016/j.sorms.2015.03.001
fatcat:f226ingmdvelpd3ows57sp7ine
Dynamic Pricing and Learning: Historical Origins, Current Research, and New Directions
2013
Social Science Research Network
Dynamic pricing and learning is a research topic that has received a considerable amount of attention in recent years, from different scientific communities: operations research and management science, ...
pricing, and provide an in-depth overview of the available literature on dynamic pricing and learning. ...
This survey is a considerable extension of the literature review in chapter 2 of den Boer (2013). ...
doi:10.2139/ssrn.2334429
fatcat:4zpajef5abhazaqc7jzicqj4ie
Online Pricing with Reserve Price Constraint for Personal Data Markets
[article]
2019
arXiv
pre-print
We thus propose a contextual dynamic pricing mechanism with the reserve price constraint, which features the properties of ellipsoid for efficient online optimization, and can support linear and non-linear market value models with uncertainty. ...
In fact, the contextual dynamic pricing problem can also be modeled into a contextual multi-armed bandit (MAB), where the arms/actions to be exploited and explored are the domain of the weight vector. ...
arXiv:1911.12598v1
fatcat:xawjaohzrfghhcdl52npcn4feq
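A toy sketch of the contextual posted-price setting this entry describes: the seller observes item features, posts max(estimated value, reserve), and only sees accept/reject feedback. This uses a naive sign-based update rather than the paper's ellipsoid method; all names and constants are illustrative:

```python
import random

def contextual_pricing(w_true, rounds=2000, dim=3, lr=0.05, reserve=0.1, seed=0):
    """Toy contextual posted-price learner.

    Market value of an item with feature vector x is w_true . x; the
    seller posts price = max(w_hat . x, reserve) and observes only
    whether the buyer accepted (price <= value)."""
    rng = random.Random(seed)
    w_hat = [0.0] * dim
    revenue = 0.0
    for _ in range(rounds):
        x = [rng.random() for _ in range(dim)]
        value = sum(wi * xi for wi, xi in zip(w_true, x))
        price = max(sum(wi * xi for wi, xi in zip(w_hat, x)), reserve)
        accepted = price <= value
        revenue += price if accepted else 0.0
        # Binary-feedback update: move the estimate toward the region
        # consistent with the observed accept/reject signal.
        sign = 1.0 if accepted else -1.0
        w_hat = [wi + lr * sign * xi for wi, xi in zip(w_hat, x)]
    return w_hat, revenue

w_hat, revenue = contextual_pricing([0.5, 0.3, 0.2])
```

The reserve price acts exactly as in the entry above: the posted price never drops below it, whatever the current estimate says.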
On the Differential Private Data Market: Endogenous Evolution, Dynamic Pricing, and Incentive Compatibility
[article]
2021
arXiv
pre-print
and the time-varying nature of privacy concerns. ...
This work uses a mechanism design approach to study the optimal market model to economize the value of privacy of personal data, using differential privacy. ...
Dynamic privacy pricing: A multi-armed bandit approach with time-variant rewards. ...
arXiv:2101.04357v2
fatcat:womp4nuzwzfnhkqcl4o25gagxa
A multi-armed bandit formulation for distributed appliances scheduling in smart grids
2014
2014 IEEE Online Conference on Green Communications (OnlineGreenComm)
In defining these methods, we model the appliances scheduling problem as a Multi-Armed Bandit (MAB) problem, a classical formulation of decision theory. ...
In order to converge to the equilibrium of the game, we adopt an efficient learning algorithm proposed in the literature, Exp3, along with two variants that we propose to speed up convergence. ...
Markov decision process multi-armed bandit problem. ...
doi:10.1109/onlinegreencom.2014.7114418
dblp:conf/onlinegreencomm/BarbatoCMP14
fatcat:dwyw7ezmobg7nmyhmsvrg4quzi
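The Exp3 algorithm mentioned above can be sketched in its generic form (not the speed-up variants proposed in the paper; the toy reward schedule is hypothetical):

```python
import math
import random

def exp3(n_arms, reward_fn, rounds=500, gamma=0.1, seed=0):
    """Exp3: exponential-weight bandit for adversarial rewards in [0, 1]."""
    rng = random.Random(seed)
    weights = [1.0] * n_arms
    total = [0.0] * n_arms
    for t in range(rounds):
        s = sum(weights)
        # Mix the weight distribution with uniform exploration.
        probs = [(1 - gamma) * w / s + gamma / n_arms for w in weights]
        arm = rng.choices(range(n_arms), weights=probs)[0]
        r = reward_fn(arm, t)            # observed reward in [0, 1]
        total[arm] += r
        est = r / probs[arm]             # importance-weighted reward estimate
        weights[arm] *= math.exp(gamma * est / n_arms)
    return total

# Toy schedule: arm 1 is always better, so it should dominate the totals.
totals = exp3(3, lambda arm, t: 0.9 if arm == 1 else 0.2)
```

The importance-weighted estimate is what lets Exp3 cope with only seeing the reward of the pulled arm.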
Automatic ad format selection via contextual bandits
2013
Proceedings of the 22nd ACM international conference on Conference on information & knowledge management - CIKM '13
To balance exploration with exploitation, we pose automatic layout selection as a contextual bandit problem. There are many bandit algorithms, each generating a policy which must be evaluated. ...
reward of using a layout we already know is effective (exploitation). ...
ε_n-greedy(c, d)
A variant of the ε-greedy algorithm, where ε decreases with time. Parameters c and d control the speed of decrease [3].
ε-first(ε) ...
doi:10.1145/2505515.2514700
dblp:conf/cikm/TangRSA13
fatcat:pd5ypbxtanb23otit7teyikpci
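The two ε-greedy variants listed above might be sketched as follows. The decay schedule uses the common c·K/(d²·t) form capped at 1; the constants and the Bernoulli reward setup are illustrative, not the paper's configuration:

```python
import random

def epsilon_n(t, c=0.1, d=0.5, n_arms=2):
    """Decaying exploration rate for epsilon_n-greedy:
    eps_t = min(1, c * n_arms / (d^2 * t))."""
    return min(1.0, c * n_arms / (d * d * (t + 1)))

def run(policy_eps, reward_means, rounds=1000, seed=0):
    """epsilon-greedy loop over Bernoulli arms, with a time-varying eps."""
    rng = random.Random(seed)
    counts = [0] * len(reward_means)
    means = [0.0] * len(reward_means)
    total = 0.0
    for t in range(rounds):
        if rng.random() < policy_eps(t):
            arm = rng.randrange(len(reward_means))                   # explore
        else:
            arm = max(range(len(means)), key=lambda i: means[i])     # exploit
        r = 1.0 if rng.random() < reward_means[arm] else 0.0
        total += r
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]
    return total

# epsilon-first: pure exploration for the first 100 rounds, then pure greedy.
eps_first = lambda t: 1.0 if t < 100 else 0.0
r1 = run(eps_first, [0.3, 0.7])
# epsilon_n-greedy: exploration decays over time.
r2 = run(epsilon_n, [0.3, 0.7])
```

ε-first spends its whole exploration budget up front, while ε_n-greedy spreads a shrinking amount of it over the horizon.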
Adaptively Optimize Content Recommendation Using Multi Armed Bandit Algorithms in E-commerce
[article]
2021
arXiv
pre-print
Multi armed bandit models (MAB) as a type of adaptive optimization algorithms provide possible approaches for such purposes. ...
Second, we compare the accumulative rewards of the three MAB algorithms with more than 1,000 trials using actual historical A/B test datasets. ...
We thank Sarfaraz Hussein for providing the testing variants. We thank Priyanka Kommidi and Achal Dalal for the effort and support in the A/B test. ...
arXiv:2108.01440v2
fatcat:z5fgjpv2cfayjhfu4li4ns77ty
SAMBA: A Generic Framework for Secure Federated Multi-Armed Bandits
2022
The Journal of Artificial Intelligence Research
The multi-armed bandit is a reinforcement learning model where a learning agent repeatedly chooses an action (pull a bandit arm) and the environment responds with a stochastic outcome (reward) coming from ...
Each data owner has data associated to a bandit arm and the bandit algorithm has to sequentially select which data owner is solicited at each time step. ...
This work was mostly done while Radu Ciucanu, Gael Marcadet, and Marta Soare were affiliated with INSA Centre Val de Loire / Univ. Orléans / LIFO, France. ...
doi:10.1613/jair.1.13163
fatcat:gue2mjfjprgnjn6x73b2aaifku
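The generic loop described above — an agent repeatedly pulls an arm and the environment responds with a stochastic reward — is commonly instantiated with UCB1. A minimal, non-secure, non-federated sketch (the Bernoulli "data owners" are hypothetical):

```python
import math
import random

def ucb1(reward_fns, rounds=1000, seed=0):
    """UCB1: pull the arm maximizing empirical mean + confidence bonus."""
    rng = random.Random(seed)
    k = len(reward_fns)
    counts = [0] * k
    means = [0.0] * k
    for t in range(rounds):
        if t < k:
            arm = t                                  # pull each arm once first
        else:
            arm = max(range(k), key=lambda i:
                      means[i] + math.sqrt(2 * math.log(t) / counts[i]))
        r = reward_fns[arm](rng)
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]
    return counts

# Two Bernoulli data owners; the better one should be solicited more often.
counts = ucb1([lambda rng: float(rng.random() < 0.3),
               lambda rng: float(rng.random() < 0.7)])
```

In the federated setting of the paper, the same selection step would be carried out over encrypted statistics; the sketch above only shows the plaintext decision rule.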
Introduction to Multi-Armed Bandits
[article]
2024
arXiv
pre-print
Multi-armed bandits are a simple but very powerful framework for algorithms that make decisions over time under uncertainty. ...
The next three chapters cover adversarial rewards, from the full-feedback version to adversarial bandits to extensions with linear rewards and combinatorially structured actions. ...
Consider an instance of multi-armed bandits with two arms and mean rewards µ₁, µ₂. ...
arXiv:1904.07272v8
fatcat:qmnt2ali3vbe3bnpcw5t5k3rp4
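The two-armed instance mentioned in the snippet lends itself to a small worked example of regret under a fixed allocation (the numbers are illustrative, not from the text):

```python
def regret_two_arms(mu1, mu2, pulls_of_arm1, rounds=1000):
    """Expected regret of a fixed allocation over two arms:
    regret = rounds * max(mu1, mu2) - expected reward collected."""
    best = max(mu1, mu2)
    expected = pulls_of_arm1 * mu1 + (rounds - pulls_of_arm1) * mu2
    return rounds * best - expected

# Pulling the worse arm 100 times out of 1000 costs 100 * (0.7 - 0.5) = 20.
r = regret_two_arms(0.5, 0.7, 100)
```

Each pull of the worse arm contributes exactly the mean gap µ₂ − µ₁ to the regret, which is why bandit analyses bound the number of suboptimal pulls.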
OL4EL: Online Learning for Edge-cloud Collaborative Learning on Heterogeneous Edges with Resource Constraints
[article]
2020
arXiv
pre-print
Then, we propose an Online Learning for EL (OL4EL) framework based on the budget-limited multi-armed bandit model. ...
Distributed machine learning (ML) at network edge is a promising paradigm that can preserve both network bandwidth and privacy of data providers. ...
bandit problem that seeks the optimal arm sequences to maximize the average arm reward while keeping the total arm cost no more than given budgets. ...
arXiv:2004.10387v2
fatcat:wsptbqqi6jhwnkh5jezwsbx3by
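A greedy sketch of the budget-limited bandit idea described above: pull the arm with the best observed reward-per-cost ratio while the budget allows. The arm names, costs, and noise model are hypothetical, and this is not the OL4EL algorithm itself:

```python
import random

def budget_limited_bandit(arms, budget, seed=0):
    """Greedy budget-limited bandit sketch. arms maps a name to
    (mean reward, cost per pull); pulling stops once no arm fits
    within the remaining budget."""
    rng = random.Random(seed)
    spent, total_reward = 0.0, 0.0
    stats = {name: [0, 0.0] for name in arms}       # pulls, mean reward

    def pull(name):
        nonlocal spent, total_reward
        mean_r, cost = arms[name]
        r = max(0.0, rng.gauss(mean_r, 0.1))        # noisy nonnegative reward
        spent += cost
        total_reward += r
        n, m = stats[name]
        stats[name] = [n + 1, m + (r - m) / (n + 1)]

    for name in arms:                               # initialize each arm once
        pull(name)
    while True:
        affordable = [n for n in arms if spent + arms[n][1] <= budget]
        if not affordable:
            break
        pull(max(affordable, key=lambda n: stats[n][1] / arms[n][1]))
    return total_reward, spent

# Hypothetical arms: name -> (mean reward, cost per pull).
arms = {"edge": (0.4, 1.0), "cloud": (0.9, 3.0)}
reward, spent = budget_limited_bandit(arms, budget=30.0)
```

The affordability check is what distinguishes this setting from a standard bandit: the horizon is set by the budget, not by a round count.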
Pervasive AI for IoT Applications: Resource-efficient Distributed Artificial Intelligence
[article]
2021
arXiv
pre-print
The confluence of pervasive computing and artificial intelligence, Pervasive AI, expanded the role of ubiquitous IoT systems from mainly data collection to executing distributed computations with a promising ...
This is driven by the easier access to sensory data and the enormous scale of pervasive/ubiquitous devices that generate zettabytes (ZB) of real-time data streams. ...
ACKNOWLEDGMENT This work was made possible by NPRP grant NPRP12S-0305-190231 and NPRP13S-0205-200265 from the Qatar National Research Fund (a member of Qatar Foundation). ...
arXiv:2105.01798v1
fatcat:4tnq2wjw4bcqdfvhnoij55s2rm
A Unified Approach to Translate Classical Bandit Algorithms to the Structured Bandit Setting
[article]
2021
arXiv
pre-print
We propose a novel approach to gradually estimate the hidden θ^* and use the estimate together with the mean reward functions to substantially reduce exploration of sub-optimal arms. ...
We consider a finite-armed structured bandit problem in which mean rewards of different arms are known functions of a common hidden parameter θ^*. ...
In this paper, we study a fundamental variant of classical multi-armed bandits called the structured multi-armed bandit problem, where mean rewards of the arms are functions of a hidden parameter θ. ...
arXiv:1810.08164v7
fatcat:ftxpgbxifbemfj2amojqcsthfy
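A minimal sketch of the structured-bandit idea in this entry: estimate the hidden θ from observed arm means via the known mean functions, then play greedily under the estimate. The grid search and the example mean functions are illustrative, not the paper's method:

```python
import random

def structured_bandit(mean_fns, theta_star, rounds=500, seed=0):
    """Structured bandit sketch: mean rewards of all arms are known
    functions of a hidden scalar theta in [0, 1]. Keep the theta
    candidate most consistent with the empirical arm means, and play
    the arm that is best under that estimate."""
    rng = random.Random(seed)
    grid = [i / 100 for i in range(101)]            # candidate thetas
    counts = [0] * len(mean_fns)
    emp = [0.0] * len(mean_fns)
    total = 0.0
    for _ in range(rounds):
        # Estimate theta: candidate minimizing pull-weighted squared error
        # between empirical means and the known mean functions.
        def err(th):
            return sum(c * (emp[i] - f(th)) ** 2
                       for i, (f, c) in enumerate(zip(mean_fns, counts)))
        theta_hat = min(grid, key=err)
        arm = max(range(len(mean_fns)), key=lambda i: mean_fns[i](theta_hat))
        r = mean_fns[arm](theta_star) + rng.gauss(0, 0.05)
        total += r
        counts[arm] += 1
        emp[arm] += (r - emp[arm]) / counts[arm]
    return total

# Hypothetical arm mean functions of theta.
fns = [lambda th: th, lambda th: 1 - th, lambda th: 0.5]
total = structured_bandit(fns, theta_star=0.8)
```

Because pulling any arm is informative about the shared θ, the learner can rule out sub-optimal arms without pulling them much — the point the abstract makes about reduced exploration.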
(Private) Kernelized Bandits with Distributed Biased Feedback
[article]
2023
arXiv
pre-print
This problem is motivated by several real-world applications (such as dynamic pricing, cellular network configuration, and policy making), where users from a large population contribute to the reward of ...
In this paper, we study kernelized bandits with distributed biased feedback. ...
Figure 1: Dynamic pricing, a motivating application of our problem. ...
arXiv:2301.12061v2
fatcat:a4ormxs2ujfgniyltu6giunxdi
Beyond Ads: Sequential Decision-Making Algorithms in Law and Public Policy
[article]
2022
arXiv
pre-print
We explore the promises and challenges of employing sequential decision-making algorithms - such as bandits, reinforcement learning, and active learning - in law and public policy. ...
We hope our work inspires more investigation of sequential decision making in law and public policy, which provide unique challenges for machine learning researchers with tremendous potential for social ...
In multi-armed bandits in particular, concept drift is sometimes modelled as discrete, abrupt changes in the mean reward of the arms [4, 89] , a paradigm sometimes called switching bandits. ...
arXiv:2112.06833v2
fatcat:ofybdipdnjcmnpfqyeuiu75dpy
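The switching-bandit form of concept drift mentioned above can be illustrated with a discounted ε-greedy learner — a standard heuristic for nonstationary bandits, not an algorithm from the paper; the change points and probabilities are hypothetical:

```python
import random

def switching_environment(t, change_points=(300, 600)):
    """Mean rewards of two arms change abruptly at the change points:
    the 'switching bandit' form of concept drift."""
    if t < change_points[0]:
        return (0.8, 0.2)
    if t < change_points[1]:
        return (0.2, 0.8)
    return (0.8, 0.2)

def discounted_greedy(rounds=900, eps=0.1, discount=0.98, seed=0):
    """epsilon-greedy with discounted statistics, so old observations
    fade and the learner can track abrupt changes in the arm means."""
    rng = random.Random(seed)
    w = [0.0, 0.0]   # discounted reward sums
    n = [0.0, 0.0]   # discounted pull counts
    total = 0.0
    for t in range(rounds):
        means = [w[i] / n[i] if n[i] > 0 else 1.0 for i in range(2)]
        if rng.random() < eps:
            arm = rng.randrange(2)
        else:
            arm = max((0, 1), key=lambda i: means[i])
        r = 1.0 if rng.random() < switching_environment(t)[arm] else 0.0
        total += r
        w = [discount * x for x in w]
        n = [discount * x for x in n]
        w[arm] += r
        n[arm] += 1.0
    return total

total = discounted_greedy()
```

Without the discount, the learner's estimates would average across regimes and it could stay on a stale arm long after a switch.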
Showing results 1 — 15 out of 248 results