multi-armed bandits
We consider a multi-agent multi-armed bandit setting in which n honest agents collaborate over a network to minimize regret, but m malicious agents can disrupt learning arbitrarily. Assuming the network is the complete graph, existing algorithms incur O((m + K/n) log(T) /...
Multi-Armed Bandits; Risk-averse; Risk measure; Entropic Value-at-Risk; Autism Spectrum Disorder; Social engagement
The stochastic multi-armed bandit problem is a standard model to solve the exploration-exploitation trade-off in sequential decision problems. In clinical trials, which are sensitive to outlier data, ...
Multi-armed bandits; coherent risk measure; cumulant generative function; concentration of measure
We study a variant of the standard stochastic multi-armed bandit problem when one is not interested in the arm with the best mean, but instead in the arm maximizing some coherent risk measure...