HDI Lab Seminar: Small Total-Cost Constraints in Contextual Bandits with Knapsacks
Evgenii Chzhen (CNRS, Université Paris-Saclay) will speak at the HDI Lab Seminar on November 7 at 2:40pm.
I will talk about some recent developments in the literature on contextual bandits with knapsacks (CBwK), a problem in which, at each round, a scalar reward is obtained and vector-valued costs are suffered. The goal is to maximize the cumulative reward while ensuring that the cumulative costs stay below some predetermined cost constraints. In this setting, total-cost constraints so far had to be at least of order T^{3/4}, where T is the number of rounds, and were even typically assumed to depend linearly on T. After elaborating on the main technical challenge and the drawback of previous approaches, I will present a dual strategy based on projected-gradient-descent updates that is able to deal with total-cost constraints of order T^{1/2}, up to poly-logarithmic terms. This strategy is direct, and it relies on a careful, adaptive tuning of the step size. The approach is inspired by parameter-free-type algorithms arising from the (online) convex optimization literature, which I will also briefly review.
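As a rough illustration of what a dual strategy with projected-gradient-descent updates can look like, here is a minimal sketch. It is not the algorithm from the talk: the function names, the per-round budget vector `budget_per_round` (total budget divided by T), the estimated rewards and costs, and the fixed `step_size` are all assumptions made for illustration, and the adaptive step-size tuning mentioned in the abstract is deliberately left out.

```python
import numpy as np

def projected_dual_step(lam, cost, budget_per_round, step_size):
    """One projected-gradient step on the dual variable (illustrative only).

    The dual variable moves in the direction of the per-round constraint
    violation and is then projected back onto the nonnegative orthant.
    """
    return np.maximum(0.0, lam + step_size * (cost - budget_per_round))

def choose_action(est_rewards, est_costs, lam):
    """Pick the action maximizing the estimated reward penalized by dual-weighted costs.

    est_rewards has shape (K,), est_costs has shape (K, d), lam has shape (d,).
    """
    penalized = est_rewards - est_costs @ lam
    return int(np.argmax(penalized))
```

In such a scheme, a round would call `choose_action` with the current dual variable, observe the cost vector of the chosen action, and then call `projected_dual_step`; a larger dual variable makes costly actions less attractive in later rounds.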
The talk is based on joint work with C. Giraud, Z. Li, and G. Stoltz.
To get a link to the online meeting, please contact Elena Alyamovskaya, ealyamovskaya@hse.ru.