NN alpha_c Range: 0.55-0.83
[Figures: Forgetting vs. Mixing Fraction (alpha); Adaptation vs. Mixing Fraction (alpha); Critical alpha vs. Domain Divergence (Analytical); Critical alpha vs. Model Size]
Neural Network Forgetting Across Domain Similarities

| cos_sim | alpha=0.0 | alpha=0.2 | alpha=0.4 | alpha=0.6 | alpha=0.8 | alpha=1.0 |
|---|---|---|---|---|---|---|
| 0.9 | 0.067 | 0.039 | 0.016 | 0.000 | 0.000 | 0.000 |
| 0.7 | 0.236 | 0.149 | 0.080 | 0.031 | 0.000 | 0.000 |
| 0.5 | 0.411 | 0.265 | 0.147 | 0.063 | 0.005 | 0.000 |
| 0.3 | 0.590 | 0.384 | 0.216 | 0.095 | 0.014 | 0.000 |
| 0.1 | 0.771 | 0.504 | 0.286 | 0.129 | 0.023 | 0.000 |

Model Size Scaling (cos_sim=0.5)

| Architecture | Params | alpha_c | Sharpness |
|---|---|---|---|
| [16] | 353 | 0.546 | 24.2 |
| [32] | 705 | 0.687 | 43.3 |
| [64] | 1,409 | 0.781 | 89.3 |
| [128] | 2,817 | 0.765 | 137.3 |
| [64,64] | 5,569 | 0.781 | 124.2 |
| [128,128] | 19,329 | 0.828 | 238.9 |
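
The headline NN alpha_c range of 0.55-0.83 appears to be the minimum and maximum of the alpha_c column in the model-size table, rounded to two decimals. The sketch below checks that reading. The dictionary `alpha_c_by_arch` and the helper `alpha_c_range` are illustrative names, not part of the original analysis, and the values are copied by hand from the table.

```python
# alpha_c values copied from the "Model Size Scaling (cos_sim=0.5)" table.
# The variable and function names here are hypothetical, for illustration only.
alpha_c_by_arch = {
    "[16]": 0.546,
    "[32]": 0.687,
    "[64]": 0.781,
    "[128]": 0.765,
    "[64,64]": 0.781,
    "[128,128]": 0.828,
}


def alpha_c_range(values):
    """Return (min, max) of the given alpha_c values, rounded to two decimals."""
    values = list(values)
    return round(min(values), 2), round(max(values), 2)


lo, hi = alpha_c_range(alpha_c_by_arch.values())
print(f"NN alpha_c range: {lo:.2f}-{hi:.2f}")  # prints "NN alpha_c range: 0.55-0.83"
```

Both endpoints come from the extremes of the size sweep ([16] and [128,128]), but alpha_c does not rise monotonically with parameter count: [128] (0.765) sits slightly below [64] (0.781).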