Investigating how the reasoning and generative capabilities of LLMs carry over when they produce judgments and decisions intended to resemble human choices.
cs.AI, Behavioral Economics, Decision Science
Non-monotonic inverted-U curve: alignment peaks at intermediate reasoning, degrades at extremes.
Weak positive relationship: higher fluency slightly increases divergence from human patterns.
Different decision tasks show varying sensitivity to reasoning depth.
Key numerical results from the capability sweep experiments.
| Metric | Value |
|---|---|
| Best Reasoning Level (r*) | 0.50 |
| JSD at r* | 0.065 |
| JSD at r=0.1 (low) | 0.147 |
| JSD at r=1.0 (high) | 0.111 |
| Peak Decision Consistency | 0.809 |
| Reasoning-JSD Correlation | 0.605 |
| Fluency-JSD Correlation | 0.512 |
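The JSD rows in the table compare LLM and human choice distributions. The source does not spell out the acronym or the exact estimator, so the following is only a minimal sketch, assuming JSD means the Jensen-Shannon divergence with base-2 logarithms (which bounds the value in [0, 1]). The function names and the example distributions are hypothetical and are not taken from the experiments.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits.

    Terms with p_i == 0 contribute nothing, by the usual convention.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions.

    Assumption: this is the quantity reported as "JSD"; the source
    does not confirm the log base or any smoothing.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of outcomes")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("distributions must have positive total mass")
    # Normalize so raw counts can be passed in directly.
    p = [x / total_p for x in p]
    q = [x / total_q for x in q]
    # Mixture distribution; m_i > 0 wherever p_i > 0 or q_i > 0.
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical choice shares over three options (illustrative only).
human_choices = [0.50, 0.30, 0.20]
llm_choices = [0.40, 0.40, 0.20]
print(jensen_shannon_divergence(human_choices, llm_choices))
```

Under this definition, identical distributions give 0 and distributions with disjoint support give 1, so lower values indicate closer alignment with the human baseline.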
Fraction of matching binary decisions between LLM and human baselines.
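The fraction described above can be sketched as a simple match rate. This is an illustrative sketch only, assuming each decision is encoded as 0 or 1 and that LLM and human decisions are paired item by item; the function name and the sample data are hypothetical.

```python
def decision_consistency(llm_decisions, human_decisions):
    """Fraction of paired binary decisions on which the two sources agree."""
    if len(llm_decisions) != len(human_decisions):
        raise ValueError("decision lists must have the same length")
    if not llm_decisions:
        raise ValueError("at least one decision pair is required")
    # Count the item-by-item agreements between the two lists.
    matches = sum(1 for a, b in zip(llm_decisions, human_decisions) if a == b)
    return matches / len(llm_decisions)


# Hypothetical paired decisions (illustrative only).
llm = [1, 0, 1, 1]
human = [1, 1, 1, 0]
print(decision_consistency(llm, human))  # 0.5
```

A value of 1.0 would mean the LLM matched the human baseline on every item, and 0.5 is what two unrelated, balanced sets of binary choices would produce on average.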
Main conclusions from the computational analysis.