Machine Learning and Data Science PhD Student Forum Series (Session 80): Bandits with Knapsacks
Speaker(s): 孙谌劼 (Peking University)
Time: 2024-11-28 16:00-17:00
Venue: Tencent Meeting 568-7810-5726
Abstract:
Multi-armed bandit problems are the predominant theoretical model of exploration-exploitation tradeoffs in learning. A variant of this model, called bandits with knapsacks, combines bandit learning with aspects of stochastic integer programming. In this model, the learner is constrained by budget limits in addition to the customary limit on the time horizon, which makes the model a closer fit to real-world scenarios.
In this talk, I will introduce the setup of the model and existing algorithms that approach the theoretical optimum in both the stochastic and adversarial settings. The discussion is based on the work of Ashwinkumar Badanidiyuru, Robert Kleinberg, and Aleksandrs Slivkins (JACM 2018), as well as Immorlica et al. (JACM 2022).
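About the forum: This online forum is organized by the machine learning laboratory of Professor Zhihua Zhang (张志华) and is held every two weeks, except on public holidays. Each session invites a PhD student to give a systematic, in-depth introduction to a frontier topic. Topics include, but are not limited to, machine learning, high-dimensional statistics, operations research and optimization, and theoretical computer science.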