How do diversity and coordination operate within chain-of-thought reasoning? A computational study of implicit multi-perspective problem solving.
| Method | Mean Q | Best Q | Max Q | Cosine Diversity | Strategy Coverage |
|---|---|---|---|---|---|
| Independent | 0.058 | 0.206 | 0.482 | 0.895 | 0.387 |
| Repulsive (RS) | 0.059 | 0.213 | 0.555 | 0.841 | 0.362 |
| Strategy-Cond. (SCG) | 0.062 | 0.216 | 0.490 | 0.918 | 0.794 |
| Ensemble (EC) | 0.058 | 0.209 | 0.452 | 0.807 | 0.375 |
Figure: Strategy coverage versus best quality across methods. Points toward the upper right indicate better coordination.
| Parameter | Value |
|---|---|
| Problem Dimension | 20 |
| Number of Problems | 50 |
| Traces per Problem (K) | 8 |
| Trace Length | 12 steps |
| Solution Optima | 6 per problem |
| Coordination Rounds | 80 |
| Random Seed | 42 |
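For reproduction purposes, the settings above could be collected in a single configuration object. The following is a minimal sketch; the class name `ExperimentConfig` and its field names are hypothetical and do not come from the original code. The derived trace count assumes that each method generates K traces for every problem.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    """Hypothetical container for the experiment settings listed in the table."""
    problem_dim: int = 20          # dimensionality of each problem
    num_problems: int = 50         # number of problems evaluated
    traces_per_problem: int = 8    # K
    trace_length: int = 12         # steps per trace
    optima_per_problem: int = 6    # solution optima (strategies) per problem
    coordination_rounds: int = 80
    seed: int = 42


config = ExperimentConfig()

# Assumption: every method produces K traces for each problem.
total_traces = config.num_problems * config.traces_per_problem
print(total_traces)
```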
Independent: Each trace follows a biased random walk toward the nearest optimum, with no coordination between traces.
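A biased random walk of this kind can be sketched as follows. The step rule (a fixed pull toward the nearest optimum plus Gaussian noise), the `bias` and `noise` values, and the uniform placement of optima are illustrative assumptions; the source does not specify them.

```python
import math
import random


def nearest_optimum(point, optima):
    """Return the optimum closest to `point` in Euclidean distance."""
    return min(optima, key=lambda o: math.dist(point, o))


def independent_trace(start, optima, steps=12, bias=0.3, noise=0.5, rng=None):
    """Biased random walk: each step mixes a pull toward the nearest
    optimum with isotropic Gaussian noise (assumed step rule)."""
    rng = rng or random.Random()
    point = list(start)
    trace = [list(point)]
    for _ in range(steps):
        target = nearest_optimum(point, optima)
        point = [
            x + bias * (t - x) + noise * rng.gauss(0.0, 1.0)
            for x, t in zip(point, target)
        ]
        trace.append(list(point))
    return trace


rng = random.Random(42)
dim = 20
# Assumed problem layout: 6 optima placed uniformly in [-3, 3]^dim.
optima = [[rng.uniform(-3, 3) for _ in range(dim)] for _ in range(6)]
start = [0.0] * dim

# 12 steps give 13 points, counting the starting point.
trace = independent_trace(start, optima, rng=rng)
```

Because each call uses only its own state, K such traces can be generated without any shared information, which is what makes this the uncoordinated baseline.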
Repulsive (RS): An RBF diversity kernel repels each trace from previously generated traces, encouraging exploration of distinct regions (repulsion strength 0.5, kernel bandwidth 1.5).
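One way to realize such a repulsion term is to step down the gradient of the summed RBF similarity to earlier trace points. The sketch below assumes a Gaussian kernel $k(x, y) = \exp(-\lVert x - y \rVert^2 / 2h^2)$ with bandwidth $h = 1.5$ and strength 0.5; the exact functional form used in the study is not stated, and the function names are hypothetical.

```python
import math


def rbf_kernel(x, y, bandwidth=1.5):
    """Gaussian RBF similarity between two points."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def repulsion_step(point, previous_points, strength=0.5, bandwidth=1.5):
    """Displacement that pushes `point` away from earlier trace points.

    Each earlier point contributes strength * k(x, y) * (x - y) / h^2,
    which is a descent direction on the summed kernel similarity.
    """
    step = [0.0] * len(point)
    for other in previous_points:
        weight = rbf_kernel(point, other, bandwidth) / bandwidth ** 2
        for i, (a, b) in enumerate(zip(point, other)):
            step[i] += strength * weight * (a - b)
    return step


def total_similarity(point, previous_points, bandwidth=1.5):
    """Summed RBF similarity of `point` to all earlier points."""
    return sum(rbf_kernel(point, p, bandwidth) for p in previous_points)


point = [0.5, 0.0]
previous = [[0.0, 0.0], [4.0, 4.0]]
step = repulsion_step(point, previous)
moved = [p + s for p, s in zip(point, step)]
```

In this toy case the nearby point at the origin dominates the push, while the distant point contributes almost nothing, so the repulsion acts locally at the scale set by the bandwidth.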
Strategy-Conditioned (SCG): Each trace is assigned a strategy label in round-robin order and biased toward the corresponding optimum, ensuring systematic exploration of all solution strategies.
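The round-robin assignment can be illustrated with the stated settings of K = 8 traces and 6 optima. The function names and the `bias` value below are hypothetical; only the round-robin labelling and the pull toward the assigned optimum come from the description.

```python
def assign_strategies(num_traces, num_strategies):
    """Round-robin strategy labels: trace k gets strategy k mod num_strategies."""
    return [k % num_strategies for k in range(num_traces)]


def strategy_step(point, target, bias=0.3):
    """Move a fraction `bias` of the way toward the assigned optimum
    (assumed step rule, noise omitted for clarity)."""
    return [x + bias * (t - x) for x, t in zip(point, target)]


labels = assign_strategies(num_traces=8, num_strategies=6)
print(labels)  # [0, 1, 2, 3, 4, 5, 0, 1]
```

With 8 traces and 6 strategies, every strategy label is assigned at least once, and the first two strategies receive a second trace.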
Ensemble (EC): Uses a portfolio of 4 specialized sub-policies with learned directional biases. The sub-policies are updated based on strategy visit counts to maintain complementary specializations.
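The update rule for the sub-policies is not given, so the following is only one plausible sketch. It assumes that each sub-policy holds a unit direction vector and that, in each coordination round, the policies are steered toward distinct strategies with the lowest visit counts. The function `update_biases`, the learning rate `lr`, and the toy 2-D layout are all hypothetical.

```python
import math


def unit(v):
    """Normalize a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def update_biases(biases, optima, visit_counts, origin, lr=0.2):
    """Steer each sub-policy toward a distinct under-visited optimum.

    Hypothetical rule: strategies are ranked from least to most visited,
    and sub-policy i blends its current direction with the direction
    toward the i-th least-visited optimum.
    """
    order = sorted(range(len(optima)), key=lambda s: (visit_counts[s], s))
    new_biases, targets = [], []
    for policy_idx, bias in enumerate(biases):
        strategy = order[policy_idx % len(order)]
        direction = unit([t - o for t, o in zip(optima[strategy], origin)])
        mixed = [(1 - lr) * b + lr * d for b, d in zip(bias, direction)]
        new_biases.append(unit(mixed))
        targets.append(strategy)
    return new_biases, targets


# Toy 2-D layout with 6 optima and 4 sub-policies.
optima = [[2, 0], [0, 2], [-2, 0], [0, -2], [2, 2], [-2, -2]]
biases = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
visit_counts = [5, 0, 3, 0, 1, 2]

biases, targets = update_biases(biases, optima, visit_counts, origin=[0.0, 0.0])
print(targets)  # [1, 3, 4, 5]
```

Assigning distinct targets per round is one simple way to keep the specializations complementary; repeating the update over the 80 coordination rounds would let the directions drift as visit counts change.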
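This work addresses the open question posed by Kim et al. (2026): "How diversity and coordination operate within the reasoning traces of LLMs remains an open question." Our computational framework provides quantitative evidence that structured diversity (diversity organized around distinct solution strategies) is more effective than geometric diversity for improving collective problem solving in multi-trace reasoning. Strategy-conditioned generation roughly doubles strategy coverage (0.794 versus 0.362 to 0.387 for the other methods) and attains the highest mean and best quality (0.062 and 0.216), although the quality margins are small and the repulsive method records the highest maximum quality (0.555).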