Adaptive Weighted Algorithm for Non-Stationary Linear Bandits
Optimal dynamic regret without prior knowledge of path length P_T
Simulation Parameters
Path Length (P_T): 5.0
Time Horizon (T): 2000
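The page does not define its symbols. A sketch of the definitions commonly used in the non-stationary linear bandit literature follows; the notation is an assumption and is not taken from the page: \theta_t is the unknown parameter at round t, \mathcal{X} is the action set, and x_t is the action played.

```latex
% Assumed notation (not from the page): \theta_t is the unknown parameter
% at round t, \mathcal{X} the action set, x_t the action actually played.
P_T = \sum_{t=2}^{T} \lVert \theta_t - \theta_{t-1} \rVert_2,
\qquad
\mathrm{DReg}_T = \sum_{t=1}^{T} \Bigl( \max_{x \in \mathcal{X}} x^\top \theta_t - x_t^\top \theta_t \Bigr)
```

Under these assumed definitions, P_T = 5.0 with T = 2000 would mean the parameter drifts by a total Euclidean distance of 5 over 2000 rounds.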
[Chart panels: Dynamic Regret Comparison; Effective Discount Factor Over Time; Regret Scaling with T; Algorithm Performance by Path Length]