Adaptive Weighted Linear Bandits

Simulation Parameters

Path Length (P_T): 5.0

Time Horizon (T): 2000

[Figure panels: Dynamic Regret Comparison; Effective Discount Factor Over Time; Regret Scaling with T; Algorithm Performance by Path Length]