Generative Speech Enhancement for In-the-Wild TTS Dataset Curation

Evaluating GSE Performance Beyond Curated Corpora | Yamauchi et al., arXiv:2601.12254

PESQ Gain (In-the-Wild): +0.46
Hallucination Rate (In-the-Wild): 15.0%
Hallucination Rate (Curated): 8.6%
Retention at Threshold 0.7: 28%
Min SNR for Reliable GSE: 5 dB

[Figures omitted. Panels: Enhancement Quality (PESQ) by Condition and Model; Hallucination Rates; Confidence Filtering Trade-off; SNR Dependence (In-the-Wild, Generative SE)]

Quality Summary Table

Condition      Raw PESQ   GSE PESQ   Improvement   Hallucination Rate
Curated        3.60       3.81       +0.21         8.6%
Semi-Wild      2.43       2.87       +0.44         11.0%
In-the-Wild    1.38       1.84       +0.46         15.0%
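
The Improvement column appears to be the GSE PESQ minus the raw PESQ for each condition. The minimal Python sketch below recomputes it from the table values as a consistency check. The dictionary layout and the names `rows`, `pesq_gain`, and `gains` are illustrative assumptions and do not come from the paper.

```python
# Values copied from the Quality Summary Table.
# The data structure and all names here are hypothetical, for illustration only.
rows = {
    "Curated": {"raw": 3.60, "gse": 3.81, "halluc_pct": 8.6},
    "Semi-Wild": {"raw": 2.43, "gse": 2.87, "halluc_pct": 11.0},
    "In-the-Wild": {"raw": 1.38, "gse": 1.84, "halluc_pct": 15.0},
}


def pesq_gain(row):
    """Return GSE PESQ minus raw PESQ, rounded to the table's two decimals."""
    return round(row["gse"] - row["raw"], 2)


# Recompute the Improvement column for every condition.
gains = {name: pesq_gain(row) for name, row in rows.items()}

for name, row in rows.items():
    print(f"{name:<12} gain={gains[name]:+.2f}  hallucination={row['halluc_pct']:.1f}%")
```

Read this way, the PESQ gain and the hallucination rate both increase from curated to in-the-wild audio, so the conditions that benefit most from enhancement are also the ones where the output most needs checking.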