In this paper, we study the combinatorial multi-armed bandit problem (CMAB) with probabilistically triggered arms (PTAs). Under the assumption that the arm triggering probabilities (ATPs) are positive for all arms, we prove that a class of upper confidence bound (UCB) policies, named ...
In this paper, we study the stochastic combinatorial multi-armed bandit (CMAB) framework that allows a general nonlinear reward function, whose expected value may not depend only on the means of the input random variables but possibly on the entire distributions of these variables. Our framew...
We investigate the combinatorial multi-armed bandit problem where an action is to select k arms from a set of base arms, and its reward is the maximum of the sample values of these k arms, under a weak feedback structure that only returns the value and...
We define a general framework for a large class of combinatorial multi-armed bandit (CMAB) problems, where simple arms with unknown distributions form super arms. In each round, a super arm is played and the outcomes of its related simple arms are observed, which helps the selection of super arms in...
Learning and Selecting Users for Achieving Reliability in Demand Response: A Multi-armed Bandit Approach. One challenge in the optimization and control of societal systems is to handle the unknown and uncertain user behavior. This paper focuses on residential d... Y Li, Q Hu, N Li. Cited by: 0. Published: 2019...
We formulate the following combinatorial multi-armed bandit (MAB) problem: There are $N$ random variables with unknown mean that are each instantiated in an i.i.d. fashion over time. At each time multiple random variables can be selected, subject to an arbitrary constraint on weights associated...
Top-$k$ Combinatorial Bandits generalize multi-armed bandits, where at each round any subset of $k$ out of $n$ arms may be chosen and the sum of the rewards is gained. We address the full-bandit feedback, in which the agent observes only the sum of rewards, in contrast to the semi...
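The difference between the two feedback models named in the entry above can be illustrated with a minimal sketch. The Bernoulli environment, the function name `play_top_k`, and the arm means are assumptions made for illustration only; they are not taken from the paper.

```python
import random

def play_top_k(means, chosen, rng):
    """Play a subset of arms in a hypothetical Bernoulli environment.

    Returns both views of the same round: the per-arm rewards
    (semi-bandit feedback) and their sum only (full-bandit feedback).
    """
    rewards = {i: 1.0 if rng.random() < means[i] else 0.0 for i in chosen}
    semi_bandit = rewards                # each chosen arm's reward is revealed
    full_bandit = sum(rewards.values())  # only the aggregate reward is revealed
    return semi_bandit, full_bandit

rng = random.Random(0)                   # fixed seed, for reproducibility
means = [0.9, 0.8, 0.5, 0.2, 0.1]        # assumed arm means (illustrative)
semi, full = play_top_k(means, chosen=[0, 1, 2], rng=rng)
print("semi-bandit feedback:", semi)
print("full-bandit feedback:", full)
```

Under full-bandit feedback the learner sees only `full` and must attribute the sum to individual arms itself, which is what makes that setting harder.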
I want to write down my thoughts on the paper [Combinatorial Bandits] by Nicolo Cesa-Bianchi and Gabor Lugosi in 2011. The first author is a great professor in this area. His 2002 paper [Finite-time analysis of the multi-armed bandit problem], on which he is the second author, has been cited over 1,50...
Combinatorial Optimization, Multi-Armed Bandit, Mixed-Integer Programming. We study dynamic decision-making under uncertainty when, at each period, the decision maker faces a different instance of a combinatorial optimization problem. doi:10.2139/ssrn.3041893. Sajad Modaresi, Denis Saure...
We define a general framework for a large class of combinatorial multi-armed bandit (CMAB) problems, where simple arms with unknown distributions form super arms. In each round, a super arm is played and the outcomes of its related simple arms are observed, which helps t...
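The play-observe-update loop described in the entry above can be sketched as follows. This is a generic UCB-style illustration, not the algorithm from the paper: the top-k "oracle" (the super arm is the k arms with the largest optimistic index), the Bernoulli outcomes, the bonus term `sqrt(1.5 * ln t / count)`, and the name `ucb_top_k` are all assumptions chosen for a small runnable example.

```python
import math
import random

def ucb_top_k(means, k, rounds, seed=0):
    """UCB-style learner for a hypothetical top-k super-arm problem.

    Each round it plays the k simple arms with the highest optimistic
    index, observes every played arm's outcome (semi-bandit feedback),
    and updates that arm's running mean estimate.
    """
    rng = random.Random(seed)
    n = len(means)
    counts = [0] * n        # number of times each simple arm was observed
    estimates = [0.0] * n   # running mean outcome of each simple arm
    for t in range(1, rounds + 1):
        # Optimistic index: estimate plus exploration bonus.
        # Arms never played get +inf so they are tried first.
        index = [
            estimates[i] + math.sqrt(1.5 * math.log(t) / counts[i])
            if counts[i] > 0 else math.inf
            for i in range(n)
        ]
        # Assumed oracle: the super arm is the k arms with the largest index.
        super_arm = sorted(range(n), key=lambda i: index[i], reverse=True)[:k]
        for i in super_arm:
            outcome = 1.0 if rng.random() < means[i] else 0.0
            counts[i] += 1
            estimates[i] += (outcome - estimates[i]) / counts[i]
    return counts, estimates

counts, estimates = ucb_top_k([0.9, 0.8, 0.5, 0.2, 0.1], k=2, rounds=2000)
print("plays per arm:", counts)
print("estimated means:", [round(e, 2) for e in estimates])
```

For a different combinatorial structure, only the line that builds `super_arm` would change: it would call whatever offline optimizer fits the constraint instead of sorting by index.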