Systematic evaluation of SURE, Tweedie, and bootstrap UQ across four SSL methods
Relative error of SURE estimates vs ground-truth MSE across noise levels.
MSE improvement of Tweedie posterior mean over pseudoinverse baseline.
Bootstrap 2-sigma coverage across operator types and SSL methods.
| SSL Method | SURE Rel. Error | HO-SURE Rel. Error | Tweedie MSE | Bootstrap Coverage | Verdict |
|---|---|---|---|---|---|
| Noise2Self | 0.442 | 0.427 | 0.499 | 0.104 | Partial |
| EI | 0.456 | 0.391 | 0.628 | 0.313 | Partial |
| SSDU | 1.100 | 1.124 | 0.219 | 0.250 | Tweedie OK |
| Noisier2Noise | 0.550 | 0.532 | 0.472 | 0.250 | Partial |
Noise2Self attains the lowest SURE relative error among all methods (0.442), attributed to its J-invariant masking aligning with SURE assumptions. For HO-SURE, EI has the lowest relative error (0.391).
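To make the relative-error metric concrete, here is a minimal sketch of a Monte Carlo SURE estimate compared against ground-truth MSE. It assumes the simplest setting (pure Gaussian denoising, identity operator, known noise level), which may differ from the operators used in the evaluation. The names `mc_sure`, `relative_error`, and the toy `shrink` denoiser are hypothetical and stand in for a trained SSL network.

```python
import numpy as np

def mc_sure(denoiser, y, sigma, eps=1e-3, rng=None):
    """Per-pixel Monte Carlo SURE for y = x + N(0, sigma^2 I).

    The divergence of the denoiser is approximated with one random
    probe vector b: div f(y) ~ b . (f(y + eps*b) - f(y)) / eps.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = y.size
    fy = denoiser(y)
    b = rng.standard_normal(y.shape)
    div = np.sum(b * (denoiser(y + eps * b) - fy)) / eps
    return np.sum((fy - y) ** 2) / n - sigma**2 + 2.0 * sigma**2 * div / n

def relative_error(estimate, truth):
    """Relative error |estimate - truth| / |truth|."""
    return abs(estimate - truth) / abs(truth)

# Toy setup (assumption): synthetic signal and a linear shrinkage denoiser.
rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)               # ground-truth signal
sigma = 0.5                                    # known noise level
y = x + sigma * rng.standard_normal(x.shape)   # noisy observation

def shrink(v):
    return 0.8 * v                             # stand-in for a trained network

sure_est = mc_sure(shrink, y, sigma, rng=rng)  # uses only y and sigma
true_mse = np.mean((shrink(y) - x) ** 2)       # needs the ground truth x
print("SURE:", sure_est, "MSE:", true_mse,
      "relative error:", relative_error(sure_est, true_mse))
```

In this idealized case SURE tracks the true MSE closely. The larger relative errors in the table suggest that the assumptions behind SURE hold only partially for the evaluated SSL methods.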
The Tweedie posterior mean achieves its lowest MSE for SSDU (0.219), where data splitting enables better score estimation.
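Tweedie's formula for Gaussian noise gives the posterior mean as `E[x | y] = y + sigma^2 * grad log p(y)`. The sketch below illustrates it under strong simplifying assumptions: an identity operator, so the baseline is just the noisy observation, and a Gaussian prior, so the score of the marginal is known in closed form. In the evaluated setting the score would instead come from the trained SSL model. The function names are hypothetical.

```python
import numpy as np

def tweedie_posterior_mean(y, sigma, score):
    """Tweedie's formula for y = x + N(0, sigma^2 I)."""
    return y + sigma**2 * score(y)

def gaussian_marginal_score(mu, tau, sigma):
    """Score of p(y) when x ~ N(mu, tau^2), so that y ~ N(mu, tau^2 + sigma^2)."""
    var = tau**2 + sigma**2
    return lambda y: -(y - mu) / var

# Toy setup (assumption): Gaussian prior with an analytic score.
rng = np.random.default_rng(0)
mu, tau, sigma = 0.0, 1.0, 0.5
x = mu + tau * rng.standard_normal(100_000)
y = x + sigma * rng.standard_normal(x.shape)

score = gaussian_marginal_score(mu, tau, sigma)
x_hat = tweedie_posterior_mean(y, sigma, score)

mse_noisy = np.mean((y - x) ** 2)        # baseline: use y directly
mse_tweedie = np.mean((x_hat - x) ** 2)  # after the Tweedie correction
print("baseline MSE:", mse_noisy, "Tweedie MSE:", mse_tweedie)
```

The quality of the correction depends entirely on the accuracy of the score, which is consistent with the observation that better score estimation yields lower Tweedie MSE.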
Equivariant bootstrapping coverage depends heavily on operator type, with Gaussian operators yielding the best results. Across SSL methods, the tabulated coverage ranges from 0.104 (Noise2Self) to 0.313 (EI).
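The 2-sigma coverage metric can be sketched as the fraction of pixels whose true value falls within two bootstrap standard deviations of the bootstrap mean. The example below assumes this per-pixel definition and uses synthetic samples in place of an actual equivariant bootstrap, so it shows only the metric and not the resampling procedure. The function name `two_sigma_coverage` is hypothetical.

```python
import numpy as np

def two_sigma_coverage(x_true, samples, k=2.0):
    """Fraction of pixels with |x_true - mean| <= k * std.

    samples has shape (num_bootstrap, num_pixels).
    """
    mean = samples.mean(axis=0)
    std = samples.std(axis=0, ddof=1)  # unbiased sample standard deviation
    return np.mean(np.abs(x_true - mean) <= k * std)

# Synthetic stand-in for bootstrap reconstructions (assumption).
rng = np.random.default_rng(0)
num_boot, num_pix, s = 100, 10_000, 0.3
x_true = rng.standard_normal(num_pix)
x_hat = x_true + s * rng.standard_normal(num_pix)  # estimator with error std s

# Calibrated case: bootstrap spread matches the true error level.
calibrated = x_hat + s * rng.standard_normal((num_boot, num_pix))
# Overconfident case: bootstrap spread is ten times too narrow.
narrow = x_hat + 0.1 * s * rng.standard_normal((num_boot, num_pix))

cov_calibrated = two_sigma_coverage(x_true, calibrated)
cov_narrow = two_sigma_coverage(x_true, narrow)
print("calibrated:", cov_calibrated, "overconfident:", cov_narrow)
```

Low coverage in this metric indicates intervals that are too narrow relative to the actual reconstruction error, as in the overconfident case above.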
Each UQ tool has strengths for specific SSL methods; practitioners should select based on the validity matrix.