Factorial analysis disentangling long-context processing demands from theory-of-mind reasoning demands.
Theory-of-mind order explains nearly 4x more variance than context length.
Higher-order ToM questions show steeper context-length degradation.
Marginal accuracy by context length (left) and ToM order (right).
| Factor | % Variance | Std Dev |
|---|---|---|
| ToM Order | 74.9% | 1.5% |
| Context Length | 19.4% | 0.4% |
| Interaction | 1.0% | 1.1% |
| Residual | 4.7% | |
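
A minimal sketch of how variance shares like these can be computed, assuming a two-way factorial ANOVA over per-condition accuracy with each term's sum of squares expressed as a share of the total. The column names (`tom_order`, `context_length`, `accuracy`) and the use of `statsmodels` are illustrative assumptions, not the authors' actual pipeline.

```python
# Sketch: two-way factorial variance decomposition (assumed method, not the
# authors' code). Assumes a balanced design so the sums of squares partition
# the total exactly. Column names are hypothetical.
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

def variance_shares(df: pd.DataFrame) -> pd.Series:
    """Percent of total sum of squares attributable to each ANOVA term."""
    model = ols("accuracy ~ C(tom_order) * C(context_length)", data=df).fit()
    anova = sm.stats.anova_lm(model, typ=2)  # sum_sq for main effects, interaction, residual
    return 100 * anova["sum_sq"] / anova["sum_sq"].sum()

# Usage: df holds one row per trial with its ToM order, context length, and accuracy.
# variance_shares(df) returns shares for C(tom_order), C(context_length),
# their interaction, and Residual, i.e. the same breakdown as the table above.
```

The Std Dev column would presumably come from repeating this decomposition across resamples or model runs; that detail is not specified here.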