Asymptotic Behavior of Standard Gradient Boosting Algorithms

Spectral filter analysis, three-regime structure, convergence studies, and asymptotic normality tests for gradient boosting and explainable boosting machines (EBMs).

[Figure: Effective Ridge Parameter vs. eta*T]

Spectral Filter Matching

eta     T      eta*T    Best Ridge mu    Filter Dist.
0.01    10     0.1      1.345            0.0181
0.01    100    1.0      0.879            0.0011
0.10    50     5.0      0.121            0.0105
0.10    200    20.0     0.028            0.0155

As eta*T increases from 0.1 to 20, the best-matching ridge parameter mu decreases monotonically from 1.345 to 0.028, confirming that eta*T controls the implicit regularization strength of gradient boosting.
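
The matching procedure can be sketched as follows. This is a minimal illustration, not the code behind the table. It assumes the standard L2-boosting filter 1 - (1 - eta*lambda)^T and the ridge filter lambda / (lambda + mu), neither of which the report states. The eigenvalue grid, the penalty grid, and the root-mean-square distance are also hypothetical choices, so the printed values are not expected to reproduce the table.

```python
import math

def boosting_filter(lam, eta, T):
    """Shrinkage on a component with eigenvalue lam after T steps of size eta.

    Assumed form (not stated in the report): 1 - (1 - eta * lam)^T.
    """
    return 1.0 - (1.0 - eta * lam) ** T

def ridge_filter(lam, mu):
    """Shrinkage applied by ridge regression with penalty mu."""
    return lam / (lam + mu)

def filter_distance(lams, eta, T, mu):
    """Root-mean-square gap between the two filters over the eigenvalue grid."""
    sq = [(boosting_filter(l, eta, T) - ridge_filter(l, mu)) ** 2 for l in lams]
    return math.sqrt(sum(sq) / len(sq))

def best_ridge_mu(lams, eta, T, mu_grid):
    """Grid search for the ridge penalty whose filter is closest to boosting's."""
    return min(((mu, filter_distance(lams, eta, T, mu)) for mu in mu_grid),
               key=lambda pair: pair[1])

# Hypothetical grids: eigenvalues in (0, 1], penalties from 1e-3 to 1e2.
lams = [(i + 1) / 100 for i in range(100)]
mu_grid = [10 ** (-3 + 5 * k / 400) for k in range(401)]

results = []
for eta, T in [(0.01, 10), (0.01, 100), (0.10, 50), (0.10, 200)]:
    mu, dist = best_ridge_mu(lams, eta, T, mu_grid)
    results.append((eta * T, mu, dist))
    print(f"eta*T={eta * T:5.1f}  best mu={mu:8.4f}  filter distance={dist:.4f}")
```

Under these assumptions the best-matching mu should fall as eta*T rises, which is the qualitative pattern the table reports.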

[Figure: Three-Regime Structure: Ridge Parameter vs. eta*T]

[Figure: Filter Distance Across Regimes]

The filter distance is smallest at eta*T ~ 1 (the critical regime), where boosting most closely approximates ridge regression. For eta*T well below 1, under-iterated boosting applies excessive smoothing; for eta*T well above 1, over-iterated boosting approaches interpolation.

[Figure: Estimator Distances as n Grows]

Convergence Summary

n      Kernel vs Ridge    Boulevard vs Ridge    Stump vs Ridge
50     0.359              0.431                 0.580
100    0.354              0.425                 0.575
200    0.365              0.437                 0.579
400    0.367              0.440                 0.580

With fixed (eta, T), the distances stay essentially flat as n grows from 50 to 400 (for example, Kernel vs Ridge ranges only from 0.354 to 0.367) rather than decreasing, which suggests that the boosting parameters must be scaled with n for convergence to the kernel ridge limit.

[Figure: EBM Cyclic Boosting vs. Kernel Ridge Variants]

EBM Distance Summary (d=3)

n      EBM vs Add. Ridge    EBM vs Full Ridge    Add. vs Full Ridge
50     0.372                0.323                0.179
100    0.389                0.334                0.190
200    0.393                0.337                0.184
300    0.383                0.325                0.183

The additive vs. full kernel ridge distance (~0.18) reflects the structural constraint of additivity. The EBM distances (0.32 to 0.39) do not shrink as n grows, suggesting that convergence to the additive kernel ridge limit requires appropriate hyperparameter scaling.
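
For readers unfamiliar with the cyclic scheme, the sketch below shows the round-robin update pattern on a toy problem. It is a simplified stand-in, not the EBM implementation used for the table. The base learner here is a bin-wise mean of the residuals for one feature, whereas EBM libraries typically fit shallow trees and add further steps such as bagging. The data, bin count, learning rate, and number of rounds are all hypothetical.

```python
import math
import random

def cyclic_boost(X, y, n_bins=8, eta=0.1, rounds=50):
    """Fit an additive model by visiting features round-robin.

    Each update is a bin-wise mean of the current residuals for one
    feature, scaled by eta. Returns (intercept, shape, training predictions).
    """
    n, d = len(X), len(X[0])
    lo = [min(row[j] for row in X) for j in range(d)]
    hi = [max(row[j] for row in X) for j in range(d)]

    def bin_index(j, v):
        width = (hi[j] - lo[j]) or 1.0           # guard against constant features
        return min(n_bins - 1, int((v - lo[j]) / width * n_bins))

    bins = [[bin_index(j, row[j]) for j in range(d)] for row in X]
    intercept = sum(y) / n
    shape = [[0.0] * n_bins for _ in range(d)]   # one step function per feature
    pred = [intercept] * n

    for _ in range(rounds):
        for j in range(d):                       # cyclic pass over features
            sums, counts = [0.0] * n_bins, [0] * n_bins
            for i in range(n):
                sums[bins[i][j]] += y[i] - pred[i]   # squared-loss residual
                counts[bins[i][j]] += 1
            step = [eta * s / c if c else 0.0 for s, c in zip(sums, counts)]
            for b in range(n_bins):
                shape[j][b] += step[b]
            for i in range(n):
                pred[i] += step[bins[i][j]]
    return intercept, shape, pred

def mse(y, pred):
    """Mean squared error between targets and predictions."""
    return sum((a - b) ** 2 for a, b in zip(y, pred)) / len(y)

# Hypothetical additive target in d = 3 features (the third is irrelevant).
rng = random.Random(0)
X = [[rng.random() for _ in range(3)] for _ in range(200)]
y = [math.sin(2 * math.pi * x[0]) + x[1] ** 2 + 0.1 * rng.gauss(0, 1) for x in X]

intercept, shape, pred = cyclic_boost(X, y)
mse_before = mse(y, [intercept] * len(y))
mse_after = mse(y, pred)
print(f"training MSE: {mse_before:.3f} -> {mse_after:.3f}")
```

Because every update moves a fraction eta of the way toward the bin-wise residual mean, the training error cannot increase from one step to the next. The product of eta and the number of rounds plays the same role here as eta*T in the spectral filter analysis.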

[Figure: Standard Deviation Decay (sqrt(n) rate)]

[Figure: KS Test p-values]

Asymptotic Normality Test Results

n      Mean      Std      KS Statistic    p-value    Normal?
50     -0.006    0.056    0.057           0.695      Yes
100    0.004     0.033    0.033           0.996      Yes
200    0.001     0.021    0.085           0.211      Yes
400    -0.002    0.015    0.051           0.807      Yes

All KS test p-values exceed 0.05, so normality is not rejected at any sample size, and the standard deviation falls from 0.056 at n=50 to 0.015 at n=400, roughly the 1/sqrt(n) rate. Both results are consistent with asymptotic normality of the gradient boosting estimator.
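
Both checks can be sketched with the standard library alone. The rate check fits a log-log slope to the n and Std columns of the table, where a slope of -0.5 corresponds to 1/sqrt(n) decay. The KS part is a generic one-sample test against a fully specified normal distribution, using the asymptotic Kolmogorov series with a commonly used small-sample correction. It is not the report's own test code, and the demonstration sample is hypothetical.

```python
import math
from statistics import NormalDist

def ks_statistic(sample, dist):
    """Largest gap between the empirical CDF and the reference CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = dist.cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

def ks_pvalue(d, n, terms=100):
    """Asymptotic Kolmogorov p-value with a small-sample correction."""
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.3:          # the alternating series is unreliable here; p is ~1
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))

def loglog_slope(ns, stds):
    """Least-squares slope of log(std) on log(n); -0.5 means a 1/sqrt(n) rate."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(s) for s in stds]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Values copied from the Asymptotic Normality Test Results table.
ns = [50, 100, 200, 400]
stds = [0.056, 0.033, 0.021, 0.015]
slope = loglog_slope(ns, stds)
print(f"fitted log-log slope: {slope:.3f}")

# Hypothetical sample: evenly spaced standard-normal quantiles, tested against N(0, 1).
m = 200
sample = [NormalDist().inv_cdf((i - 0.5) / m) for i in range(1, m + 1)]
d_stat = ks_statistic(sample, NormalDist(0.0, 1.0))
print(f"KS statistic: {d_stat:.4f}, p-value: {ks_pvalue(d_stat, m):.3f}")
```

With the rounded values in the table, the fitted slope comes out somewhat steeper than -0.5. If the mean and standard deviation of the reference distribution are estimated from the same sample, the p-value from this series is conservative. In practice a library routine such as `scipy.stats.kstest` would normally replace the hand-written functions.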