Generative Speech Enhancement for In-the-Wild TTS Dataset Curation
Evaluating GSE Performance Beyond Curated Corpora | Yamauchi et al., arXiv:2601.12254
PESQ Gain (In-the-Wild): +0.46
Hallucination Rate (In-the-Wild): 15.0%
Hallucination Rate (Curated): 8.6%
Retention at Threshold 0.7: 28%
Min SNR for Reliable GSE: 5 dB
[Figures: Enhancement Quality (PESQ) by Condition and Model; Hallucination Rates; Confidence Filtering Trade-off (threshold shown: 0.5); SNR Dependence (In-the-Wild, Generative SE)]
Quality Summary Table

Condition      Raw PESQ   GSE PESQ   Improvement   Hallucination Rate
Curated        3.60       3.81       +0.21         8.6%
Semi-Wild      2.43       2.87       +0.44         11.0%
In-the-Wild    1.38       1.84       +0.46         15.0%
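
The Improvement figures in the Quality Summary Table appear to be the plain difference between GSE PESQ and Raw PESQ for each condition. The sketch below recomputes that difference from the table's numbers as a consistency check. The dictionary layout and the `pesq_gain` helper are illustrative assumptions, not part of the original work, and rounding to two decimals is assumed to match the table's precision.

```python
# Values copied from the Quality Summary Table:
# condition -> (raw PESQ, GSE PESQ, reported improvement)
rows = {
    "Curated": (3.60, 3.81, 0.21),
    "Semi-Wild": (2.43, 2.87, 0.44),
    "In-the-Wild": (1.38, 1.84, 0.46),
}


def pesq_gain(raw: float, enhanced: float) -> float:
    """Hypothetical helper: PESQ improvement as enhanced minus raw,
    rounded to two decimals (assumed table precision)."""
    return round(enhanced - raw, 2)


# Recompute the improvement for every condition.
gains = {name: pesq_gain(raw, gse) for name, (raw, gse, _) in rows.items()}

# Print the recomputed gain next to the reported one for comparison.
for name, (_, _, reported) in rows.items():
    print(f"{name}: recomputed {gains[name]:+.2f}, reported {reported:+.2f}")
```

Under these assumptions the recomputed gains agree with the reported column, and the largest gain coincides with the noisiest condition, which is also the one with the highest hallucination rate.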