Minimax Dynamic Regret Under Time-Varying Arm Sets
Non-stationary Linear Bandits | Wang et al., arXiv:2601.01069
Weighted LS Exponent: 0.877
Restarting Exponent: 0.858
Theoretical Optimal: 0.667
Feature Dimension: d=5
Max Horizon: 10K
[Figure panels: Regret Scaling with Horizon (Log-Log); Arm Variation Impact (T=1000); Non-stationarity Budget; Scaling Exponents]
Experiment Summary
Algorithm       Exponent  R²     Type
Weighted LS     0.877     1.000  Adaptive
Sliding Window  0.877     1.000  Adaptive
Restarting      0.858     0.999  Adaptive
MASTER          0.878     1.000  Static
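
The source does not say how the exponents and R² values were obtained. One common approach, consistent with the log-log regret plot, is to fit regret ≈ C · T^α by ordinary least squares on (log T, log regret) and report the slope α together with the R² of that fit. The sketch below assumes this method. The function name `fit_scaling_exponent`, the horizon grid, and the synthetic regret curve are all hypothetical and are not the experiment's data.

```python
import numpy as np


def fit_scaling_exponent(horizons, regrets):
    """Fit regret ~ C * T**alpha by least squares in log-log space.

    Returns the slope alpha and the R^2 of the linear fit on the logs.
    """
    log_t = np.log(np.asarray(horizons, dtype=float))
    log_r = np.log(np.asarray(regrets, dtype=float))

    # Degree-1 polynomial fit: log_r ~ alpha * log_t + log_c
    alpha, log_c = np.polyfit(log_t, log_r, 1)

    # Coefficient of determination of the fit in log space
    pred = alpha * log_t + log_c
    ss_res = np.sum((log_r - pred) ** 2)
    ss_tot = np.sum((log_r - log_r.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return alpha, r2


# Synthetic illustration only (assumed data): an exact T^(2/3) curve,
# matching the theoretical optimal exponent of 0.667 quoted above.
horizons = np.array([100, 300, 1000, 3000, 10000])
regrets = 2.0 * horizons ** (2.0 / 3.0)

alpha, r2 = fit_scaling_exponent(horizons, regrets)
print(f"exponent={alpha:.3f}, R^2={r2:.3f}")
```

Under this reading, a fitted exponent such as 0.877 with R² near 1.000 would mean the measured regret follows a clean power law over the tested horizons, but one that grows faster than the T^0.667 rate.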