FLAIR Seminar: Solving pure exploration problems with the Top Two approach

Abstract

In pure exploration problems for stochastic multi-armed bandits, the goal is to answer a question about a set of unknown distributions (modeling for example the efficacy of a treatment) from which we can collect samples (measure its effect), and to provide guarantees on the candidate answer. The archetypal example is the best arm identification problem, in which the agent aims at identifying the arm with the highest mean. In this talk, I will focus on the class of Top Two algorithms, which select the next arm to sample from among two candidate arms, a leader and a challenger. Due to their simplicity and interpretability, Top Two algorithms have received increased attention in recent years. In the fixed-confidence setting, Top Two algorithms have an asymptotically optimal expected sample complexity (number of collected samples when the error level vanishes). In the anytime setting, we propose a Top Two algorithm which has guarantees on the probability of misidentifying a good enough arm at any time.

Date

Feb 19, 2024 1:15 PM

Event

FLAIR Seminar

Location

EPFL

Lausanne

FLAIR Seminar: Solving pure exploration problems with the Top Two approach

Abstract

Marc Jourdan

Post-Doctoral Researcher