Interactive exploration of why LLMs outperform humans at recovering meaning from Jabberwockified text. Based on Lupyan et al. (2026, arXiv:2601.11432).
| LLM | Pearson r | p (Pearson) | Kendall τ | p (Kendall) | Interpretation |
|---|---|---|---|---|---|
| GPT-4 | 0.807 | 0.052 | 0.600 | 0.136 | Strong positive (n.s.) |
| Claude | 0.853 | 0.031 | 0.467 | 0.272 | Strong positive (sig.) |
| LLaMA-70B | 0.813 | 0.049 | 0.200 | 0.719 | Strong positive (sig.) |
| LLaMA-7B | 0.985 | <0.001 | 1.000 | 0.003 | Near-perfect (sig.) |
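For readers unfamiliar with the two statistics in the table: Pearson r measures linear association between paired scores, while Kendall τ measures rank agreement. The sketch below computes both in pure Python (scipy's `pearsonr` / `kendalltau` are the usual tools) on hypothetical per-item scores; the numbers are illustrative placeholders, not data from the paper.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Pearson correlation: covariance normalized by both standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs (assumes no ties)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical (LLM accuracy, human accuracy) pairs per Jabberwockified passage.
llm_scores   = [0.82, 0.61, 0.74, 0.55, 0.90, 0.47]
human_scores = [0.40, 0.28, 0.27, 0.22, 0.46, 0.18]

print(f"Pearson r = {pearson_r(llm_scores, human_scores):.3f}")
print(f"Kendall tau = {kendall_tau(llm_scores, human_scores):.3f}")
```

Note how the two can diverge: τ depends only on the ordering of items, so a single rank swap lowers it even when the scores remain nearly linear, which is why the table's Kendall column is noisier than the Pearson column at these small sample sizes.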