A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2021; you can also visit the original URL.
The file type is application/pdf.
Data Pricing in Machine Learning Pipelines
[article]
2021
arXiv
pre-print
In this article, we survey the principles and the latest research development of data pricing in machine learning pipelines. We start with a brief review of data marketplaces and pricing desiderata. ...
As machine learning pipelines involve many parties and, in order to be successful, have to form a constructive and dynamic eco-system, marketplaces and data pricing are fundamental in connecting and facilitating ...
They develop a multi-armed bandit algorithm to extend the DG model [20], which dynamically adjusts the parameter β in Equation 6. A larger β encourages more accurate labels but costs more money. ...
arXiv:2108.07915v1
fatcat:736zip2pbndupl7hixdbfz33om
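The snippet above describes a bandit that dynamically tunes the pricing parameter β. As a rough illustration of the idea only (not the algorithm from the paper — the candidate grid, reward function, and ε-greedy strategy here are all hypothetical):

```python
import random

def tune_beta(candidate_betas, reward_fn, rounds=1000, eps=0.1, seed=0):
    """epsilon-greedy bandit over a discrete grid of beta values.

    Each arm is a candidate beta; reward_fn(beta) returns a noisy
    payoff (e.g. label accuracy minus payment cost)."""
    rng = random.Random(seed)
    counts = [0] * len(candidate_betas)
    means = [0.0] * len(candidate_betas)
    for _ in range(rounds):
        if rng.random() < eps:
            arm = rng.randrange(len(candidate_betas))                       # explore
        else:
            arm = max(range(len(candidate_betas)), key=lambda i: means[i])  # exploit
        r = reward_fn(candidate_betas[arm])
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]                        # incremental mean
    return candidate_betas[max(range(len(candidate_betas)), key=lambda i: means[i])]

# Toy reward: accuracy grows with beta, but cost grows quadratically.
noisy = random.Random(1)
reward = lambda b: min(1.0, b) - 0.8 * b * b + noisy.gauss(0, 0.05)
best_beta = tune_beta([0.1, 0.3, 0.5, 0.7, 0.9], reward)
```

The learner should concentrate on the β values near the optimum of the toy reward curve.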
Dynamic pricing and learning: Historical origins, current research, and new directions
2015
Surveys in Operations Research and Management Science
Dynamic pricing and learning is a research topic that has received a considerable amount of attention in recent years, from different scientific communities: operations research and management science, ...
pricing, and provide an in-depth overview of the available literature on dynamic pricing and learning. ...
This survey is a considerable extension of the literature review in chapter 2 of den Boer (2013). ...
doi:10.1016/j.sorms.2015.03.001
fatcat:f226ingmdvelpd3ows57sp7ine
Dynamic Pricing and Learning: Historical Origins, Current Research, and New Directions
2013
Social Science Research Network
Dynamic pricing and learning is a research topic that has received a considerable amount of attention in recent years, from different scientific communities: operations research and management science, ...
pricing, and provide an in-depth overview of the available literature on dynamic pricing and learning. ...
This survey is a considerable extension of the literature review in chapter 2 of den Boer (2013). ...
doi:10.2139/ssrn.2334429
fatcat:4zpajef5abhazaqc7jzicqj4ie
Online Pricing with Reserve Price Constraint for Personal Data Markets
[article]
2019
arXiv
pre-print
We thus propose a contextual dynamic pricing mechanism with the reserve price constraint, which features the properties of ellipsoid for efficient online optimization, and can support linear and non-linear market value models with uncertainty. ...
In fact, the contextual dynamic pricing problem can also be modeled into a contextual multi-armed bandit (MAB), where the arms/actions to be exploited and explored are the domain of the weight vector. ...
arXiv:1911.12598v1
fatcat:xawjaohzrfghhcdl52npcn4feq
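A toy sketch of the contextual posted-price setting this entry describes: the seller observes item features, posts max(estimated value, reserve), and only sees accept/reject feedback. This uses a naive sign-based update rather than the paper's ellipsoid method; all names and constants are illustrative:

```python
import random

def contextual_pricing(w_true, rounds=2000, dim=3, lr=0.05, reserve=0.1, seed=0):
    """Toy contextual posted-price learner.

    Market value of an item with feature vector x is w_true . x; the
    seller posts price = max(w_hat . x, reserve) and observes only
    whether the buyer accepted (price <= value)."""
    rng = random.Random(seed)
    w_hat = [0.0] * dim
    revenue = 0.0
    for _ in range(rounds):
        x = [rng.random() for _ in range(dim)]
        value = sum(wi * xi for wi, xi in zip(w_true, x))
        price = max(sum(wi * xi for wi, xi in zip(w_hat, x)), reserve)
        accepted = price <= value
        revenue += price if accepted else 0.0
        # Binary-feedback update: move the estimate toward the region
        # consistent with the observed accept/reject signal.
        sign = 1.0 if accepted else -1.0
        w_hat = [wi + lr * sign * xi for wi, xi in zip(w_hat, x)]
    return w_hat, revenue

w_hat, revenue = contextual_pricing([0.5, 0.3, 0.2])
```

The reserve price acts exactly as in the entry above: the posted price never drops below it, whatever the current estimate says.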
On the Differential Private Data Market: Endogenous Evolution, Dynamic Pricing, and Incentive Compatibility
[article]
2021
arXiv
pre-print
and the time-varying nature of privacy concerns. ...
This work uses a mechanism design approach to study the optimal market model to economize the value of privacy of personal data, using differential privacy. ...
Dynamic privacy pricing: A multi-armed bandit approach with time-variant rewards. ...
arXiv:2101.04357v2
fatcat:womp4nuzwzfnhkqcl4o25gagxa
A multi-armed bandit formulation for distributed appliances scheduling in smart grids
2014
2014 IEEE Online Conference on Green Communications (OnlineGreenComm)
In defining these methods, we model the appliances scheduling problem as a Multi-Armed Bandit (MAB) problem, a classical formulation of decision theory. ...
In order to converge to the equilibrium of the game, we adopt an efficient learning algorithm proposed in the literature, Exp3, along with two variants that we propose to speed up convergence. ...
Markov decision process multi-armed bandit problem. ...
doi:10.1109/onlinegreencom.2014.7114418
dblp:conf/onlinegreencomm/BarbatoCMP14
fatcat:dwyw7ezmobg7nmyhmsvrg4quzi
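The Exp3 algorithm mentioned above can be sketched in its generic form (not the speed-up variants proposed in the paper; the toy reward schedule is hypothetical):

```python
import math
import random

def exp3(n_arms, reward_fn, rounds=500, gamma=0.1, seed=0):
    """Exp3: exponential-weight bandit for adversarial rewards in [0, 1]."""
    rng = random.Random(seed)
    weights = [1.0] * n_arms
    total = [0.0] * n_arms
    for t in range(rounds):
        s = sum(weights)
        # Mix the weight distribution with uniform exploration.
        probs = [(1 - gamma) * w / s + gamma / n_arms for w in weights]
        arm = rng.choices(range(n_arms), weights=probs)[0]
        r = reward_fn(arm, t)            # observed reward in [0, 1]
        total[arm] += r
        est = r / probs[arm]             # importance-weighted reward estimate
        weights[arm] *= math.exp(gamma * est / n_arms)
    return total

# Toy schedule: arm 1 is always better, so it should dominate the totals.
totals = exp3(3, lambda arm, t: 0.9 if arm == 1 else 0.2)
```

The importance-weighted estimate is what lets Exp3 cope with only seeing the reward of the pulled arm.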
Automatic ad format selection via contextual bandits
2013
Proceedings of the 22nd ACM international conference on Conference on information & knowledge management - CIKM '13
To balance exploration with exploitation, we pose automatic layout selection as a contextual bandit problem. There are many bandit algorithms, each generating a policy which must be evaluated. ...
reward of using a layout we already know is effective (exploitation). ...
ε_n-greedy(c, d)
A variant of the ε-greedy algorithm, where ε decreases with time. Parameters c and d control the speed of decrease [3].
ε-first(ε) ...
doi:10.1145/2505515.2514700
dblp:conf/cikm/TangRSA13
fatcat:pd5ypbxtanb23otit7teyikpci
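The two ε-greedy variants listed above might be sketched as follows. The decay schedule uses the common c·K/(d²·t) form capped at 1; the constants and the Bernoulli reward setup are illustrative, not the paper's configuration:

```python
import random

def epsilon_n(t, c=0.1, d=0.5, n_arms=2):
    """Decaying exploration rate for epsilon_n-greedy:
    eps_t = min(1, c * n_arms / (d^2 * t))."""
    return min(1.0, c * n_arms / (d * d * (t + 1)))

def run(policy_eps, reward_means, rounds=1000, seed=0):
    """epsilon-greedy loop over Bernoulli arms, with a time-varying eps."""
    rng = random.Random(seed)
    counts = [0] * len(reward_means)
    means = [0.0] * len(reward_means)
    total = 0.0
    for t in range(rounds):
        if rng.random() < policy_eps(t):
            arm = rng.randrange(len(reward_means))                   # explore
        else:
            arm = max(range(len(means)), key=lambda i: means[i])     # exploit
        r = 1.0 if rng.random() < reward_means[arm] else 0.0
        total += r
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]
    return total

# epsilon-first: pure exploration for the first 100 rounds, then pure greedy.
eps_first = lambda t: 1.0 if t < 100 else 0.0
r1 = run(eps_first, [0.3, 0.7])
# epsilon_n-greedy: exploration decays over time.
r2 = run(epsilon_n, [0.3, 0.7])
```

ε-first spends its whole exploration budget up front, while ε_n-greedy spreads a shrinking amount of it over the horizon.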
Adaptively Optimize Content Recommendation Using Multi Armed Bandit Algorithms in E-commerce
[article]
2021
arXiv
pre-print
Multi armed bandit models (MAB) as a type of adaptive optimization algorithms provide possible approaches for such purposes. ...
Second, we compare the accumulative rewards of the three MAB algorithms with more than 1,000 trials using actual historical A/B test datasets. ...
We thank Sarfaraz Hussein for providing the testing variants. We thank Priyanka Kommidi and Achal Dalal for the effort and support in the A/B test. ...
arXiv:2108.01440v2
fatcat:z5fgjpv2cfayjhfu4li4ns77ty
SAMBA: A Generic Framework for Secure Federated Multi-Armed Bandits
2022
The Journal of Artificial Intelligence Research
The multi-armed bandit is a reinforcement learning model where a learning agent repeatedly chooses an action (pull a bandit arm) and the environment responds with a stochastic outcome (reward) coming from ...
Each data owner has data associated to a bandit arm and the bandit algorithm has to sequentially select which data owner is solicited at each time step. ...
This work was mostly done while Radu Ciucanu, Gael Marcadet, and Marta Soare were affiliated with INSA Centre Val de Loire / Univ. Orléans / LIFO, France. ...
doi:10.1613/jair.1.13163
fatcat:gue2mjfjprgnjn6x73b2aaifku
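The generic loop described above — an agent repeatedly pulls an arm and the environment responds with a stochastic reward — is commonly instantiated with UCB1. A minimal, non-secure, non-federated sketch (the Bernoulli "data owners" are hypothetical):

```python
import math
import random

def ucb1(reward_fns, rounds=1000, seed=0):
    """UCB1: pull the arm maximizing empirical mean + confidence bonus."""
    rng = random.Random(seed)
    k = len(reward_fns)
    counts = [0] * k
    means = [0.0] * k
    for t in range(rounds):
        if t < k:
            arm = t                                  # pull each arm once first
        else:
            arm = max(range(k), key=lambda i:
                      means[i] + math.sqrt(2 * math.log(t) / counts[i]))
        r = reward_fns[arm](rng)
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]
    return counts

# Two Bernoulli data owners; the better one should be solicited more often.
counts = ucb1([lambda rng: float(rng.random() < 0.3),
               lambda rng: float(rng.random() < 0.7)])
```

In the federated setting of the paper, the same selection step would be carried out over encrypted statistics; the sketch above only shows the plaintext decision rule.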
Introduction to Multi-Armed Bandits
[article]
2024
arXiv
pre-print
Multi-armed bandits are a simple but very powerful framework for algorithms that make decisions over time under uncertainty. ...
The next three chapters cover adversarial rewards, from the full-feedback version to adversarial bandits to extensions with linear rewards and combinatorially structured actions. ...
Consider an instance of multi-armed bandits with two arms and mean rewards µ₁, µ₂. ...
arXiv:1904.07272v8
fatcat:qmnt2ali3vbe3bnpcw5t5k3rp4
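The two-armed instance mentioned in the snippet lends itself to a small worked example of regret under a fixed allocation (the numbers are illustrative, not from the text):

```python
def regret_two_arms(mu1, mu2, pulls_of_arm1, rounds=1000):
    """Expected regret of a fixed allocation over two arms:
    regret = rounds * max(mu1, mu2) - expected reward collected."""
    best = max(mu1, mu2)
    expected = pulls_of_arm1 * mu1 + (rounds - pulls_of_arm1) * mu2
    return rounds * best - expected

# Pulling the worse arm 100 times out of 1000 costs 100 * (0.7 - 0.5) = 20.
r = regret_two_arms(0.5, 0.7, 100)
```

Each pull of the worse arm contributes exactly the mean gap µ₂ − µ₁ to the regret, which is why bandit analyses bound the number of suboptimal pulls.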
OL4EL: Online Learning for Edge-cloud Collaborative Learning on Heterogeneous Edges with Resource Constraints
[article]
2020
arXiv
pre-print
Then, we propose an Online Learning for EL (OL4EL) framework based on the budget-limited multi-armed bandit model. ...
Distributed machine learning (ML) at network edge is a promising paradigm that can preserve both network bandwidth and privacy of data providers. ...
bandit problem that seeks the optimal arm sequences to maximize the average arm reward while keeping the total arm cost no more than given budgets. ...
arXiv:2004.10387v2
fatcat:wsptbqqi6jhwnkh5jezwsbx3by
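A greedy sketch of the budget-limited bandit idea described above: pull the arm with the best observed reward-per-cost ratio while the budget allows. The arm names, costs, and noise model are hypothetical, and this is not the OL4EL algorithm itself:

```python
import random

def budget_limited_bandit(arms, budget, seed=0):
    """Greedy budget-limited bandit sketch. arms maps a name to
    (mean reward, cost per pull); pulling stops once no arm fits
    within the remaining budget."""
    rng = random.Random(seed)
    spent, total_reward = 0.0, 0.0
    stats = {name: [0, 0.0] for name in arms}       # pulls, mean reward

    def pull(name):
        nonlocal spent, total_reward
        mean_r, cost = arms[name]
        r = max(0.0, rng.gauss(mean_r, 0.1))        # noisy nonnegative reward
        spent += cost
        total_reward += r
        n, m = stats[name]
        stats[name] = [n + 1, m + (r - m) / (n + 1)]

    for name in arms:                               # initialize each arm once
        pull(name)
    while True:
        affordable = [n for n in arms if spent + arms[n][1] <= budget]
        if not affordable:
            break
        pull(max(affordable, key=lambda n: stats[n][1] / arms[n][1]))
    return total_reward, spent

# Hypothetical arms: name -> (mean reward, cost per pull).
arms = {"edge": (0.4, 1.0), "cloud": (0.9, 3.0)}
reward, spent = budget_limited_bandit(arms, budget=30.0)
```

The affordability check is what distinguishes this setting from a standard bandit: the horizon is set by the budget, not by a round count.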
Pervasive AI for IoT Applications: Resource-efficient Distributed Artificial Intelligence
[article]
2021
arXiv
pre-print
The confluence of pervasive computing and artificial intelligence, Pervasive AI, expanded the role of ubiquitous IoT systems from mainly data collection to executing distributed computations with a promising ...
This is driven by the easier access to sensory data and the enormous scale of pervasive/ubiquitous devices that generate zettabytes (ZB) of real-time data streams. ...
ACKNOWLEDGMENT This work was made possible by NPRP grant NPRP12S-0305-190231 and NPRP13S-0205-200265 from the Qatar National Research Fund (a member of Qatar Foundation). ...
arXiv:2105.01798v1
fatcat:4tnq2wjw4bcqdfvhnoij55s2rm
A Unified Approach to Translate Classical Bandit Algorithms to the Structured Bandit Setting
[article]
2021
arXiv
pre-print
We propose a novel approach to gradually estimate the hidden θ^* and use the estimate together with the mean reward functions to substantially reduce exploration of sub-optimal arms. ...
We consider a finite-armed structured bandit problem in which mean rewards of different arms are known functions of a common hidden parameter θ^*. ...
In this paper, we study a fundamental variant of classical multi-armed bandits called the structured multi-armed bandit problem, where mean rewards of the arms are functions of a hidden parameter θ. ...
arXiv:1810.08164v7
fatcat:ftxpgbxifbemfj2amojqcsthfy
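A minimal sketch of the structured-bandit idea in this entry: estimate the hidden θ from observed arm means via the known mean functions, then play greedily under the estimate. The grid search and the example mean functions are illustrative, not the paper's method:

```python
import random

def structured_bandit(mean_fns, theta_star, rounds=500, seed=0):
    """Structured bandit sketch: mean rewards of all arms are known
    functions of a hidden scalar theta in [0, 1]. Keep the theta
    candidate most consistent with the empirical arm means, and play
    the arm that is best under that estimate."""
    rng = random.Random(seed)
    grid = [i / 100 for i in range(101)]            # candidate thetas
    counts = [0] * len(mean_fns)
    emp = [0.0] * len(mean_fns)
    total = 0.0
    for _ in range(rounds):
        # Estimate theta: candidate minimizing pull-weighted squared error
        # between empirical means and the known mean functions.
        def err(th):
            return sum(c * (emp[i] - f(th)) ** 2
                       for i, (f, c) in enumerate(zip(mean_fns, counts)))
        theta_hat = min(grid, key=err)
        arm = max(range(len(mean_fns)), key=lambda i: mean_fns[i](theta_hat))
        r = mean_fns[arm](theta_star) + rng.gauss(0, 0.05)
        total += r
        counts[arm] += 1
        emp[arm] += (r - emp[arm]) / counts[arm]
    return total

# Hypothetical arm mean functions of theta.
fns = [lambda th: th, lambda th: 1 - th, lambda th: 0.5]
total = structured_bandit(fns, theta_star=0.8)
```

Because pulling any arm is informative about the shared θ, the learner can rule out sub-optimal arms without pulling them much — the point the abstract makes about reduced exploration.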
(Private) Kernelized Bandits with Distributed Biased Feedback
[article]
2023
arXiv
pre-print
This problem is motivated by several real-world applications (such as dynamic pricing, cellular network configuration, and policy making), where users from a large population contribute to the reward of ...
In this paper, we study kernelized bandits with distributed biased feedback. ...
Figure 1: Dynamic pricing, a motivating application of our problem. ...
arXiv:2301.12061v2
fatcat:a4ormxs2ujfgniyltu6giunxdi
Beyond Ads: Sequential Decision-Making Algorithms in Law and Public Policy
[article]
2022
arXiv
pre-print
We explore the promises and challenges of employing sequential decision-making algorithms - such as bandits, reinforcement learning, and active learning - in law and public policy. ...
We hope our work inspires more investigation of sequential decision making in law and public policy, which provide unique challenges for machine learning researchers with tremendous potential for social ...
In multi-armed bandits in particular, concept drift is sometimes modelled as discrete, abrupt changes in the mean reward of the arms [4, 89] , a paradigm sometimes called switching bandits. ...
arXiv:2112.06833v2
fatcat:ofybdipdnjcmnpfqyeuiu75dpy
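The switching-bandit form of concept drift mentioned above can be illustrated with a discounted ε-greedy learner — a standard heuristic for nonstationary bandits, not an algorithm from the paper; the change points and probabilities are hypothetical:

```python
import random

def switching_environment(t, change_points=(300, 600)):
    """Mean rewards of two arms change abruptly at the change points:
    the 'switching bandit' form of concept drift."""
    if t < change_points[0]:
        return (0.8, 0.2)
    if t < change_points[1]:
        return (0.2, 0.8)
    return (0.8, 0.2)

def discounted_greedy(rounds=900, eps=0.1, discount=0.98, seed=0):
    """epsilon-greedy with discounted statistics, so old observations
    fade and the learner can track abrupt changes in the arm means."""
    rng = random.Random(seed)
    w = [0.0, 0.0]   # discounted reward sums
    n = [0.0, 0.0]   # discounted pull counts
    total = 0.0
    for t in range(rounds):
        means = [w[i] / n[i] if n[i] > 0 else 1.0 for i in range(2)]
        if rng.random() < eps:
            arm = rng.randrange(2)
        else:
            arm = max((0, 1), key=lambda i: means[i])
        r = 1.0 if rng.random() < switching_environment(t)[arm] else 0.0
        total += r
        w = [discount * x for x in w]
        n = [discount * x for x in n]
        w[arm] += r
        n[arm] += 1.0
    return total

total = discounted_greedy()
```

Without the discount, the learner's estimates would average across regimes and it could stay on a stale arm long after a switch.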
Showing results 1 — 15 out of 248 results