Systematic computational study of continuous symmetries, solution manifold geometry, sparsity, and algorithmic multiplicity in transformer models.

| Config (d, L, H) | Total Params | QK Sym | VO Sym | MLP Sym | Total Sym | Ratio (Total Sym / Total Params) |
|---|---|---|---|---|---|---|
| d=8, L=1, H=1 | 800 | 64 | 64 | 32 | 160 | 0.200 |
| d=8, L=1, H=2 | 800 | 32 | 32 | 32 | 96 | 0.120 |
| d=16, L=1, H=2 | 3,200 | 128 | 128 | 64 | 320 | 0.100 |
| d=16, L=2, H=2 | 6,272 | 256 | 256 | 128 | 640 | 0.102 |
| d=32, L=2, H=4 | 25,088 | 512 | 512 | 256 | 1,280 | 0.051 |
| d=64, L=4, H=8 | 198,656 | 2,048 | 2,048 | 1,024 | 5,120 | 0.026 |
| d=128, L=6, H=8 | 1,187,840 | 12,288 | 12,288 | 3,072 | 27,648 | 0.023 |
| d=512, L=12, H=8 | 37,879,808 | 393,216 | 393,216 | 24,576 | 811,008 | 0.021 |
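
The symmetry columns above can be reproduced with a few lines of arithmetic. The sketch below is a reading inferred from the table rows, not a formula stated in the source: it assumes `QK Sym` and `VO Sym` each equal `L * H * (d / H)^2`, and `MLP Sym` equals `L * 4 * d`. That would be consistent with one invertible `d_head x d_head` reparameterization per head and one scale per hidden unit of a `4d`-wide MLP, but treat that interpretation as an assumption. The function names and the `mlp_mult` parameter are hypothetical. `Total Params` is taken from the table rather than derived.

```python
def symmetry_counts(d, n_layers, n_heads, mlp_mult=4):
    """Symmetry dimensions per column, under the assumed formulas.

    Assumption (inferred from the table, not stated in the source):
      QK Sym = VO Sym = L * H * d_head^2, with d_head = d / H
      MLP Sym = L * mlp_mult * d
    """
    d_head = d // n_heads
    qk = n_layers * n_heads * d_head ** 2
    vo = n_layers * n_heads * d_head ** 2
    mlp = n_layers * mlp_mult * d
    return {"qk": qk, "vo": vo, "mlp": mlp, "total": qk + vo + mlp}


def symmetry_ratio(total_sym, total_params):
    """Fraction of parameter dimensions accounted for by symmetries."""
    return total_sym / total_params


if __name__ == "__main__":
    # (d, L, H, Total Params) copied from three rows of the table.
    rows = [(8, 1, 1, 800), (64, 4, 8, 198_656), (512, 12, 8, 37_879_808)]
    for d, n_layers, n_heads, total_params in rows:
        counts = symmetry_counts(d, n_layers, n_heads)
        ratio = symmetry_ratio(counts["total"], total_params)
        print(d, n_layers, n_heads, counts, round(ratio, 3))
```

Under these assumptions, `symmetry_counts(16, 1, 2)["total"]` evaluates to 320, matching the `Total Sym` entry for d=16, L=1, H=2.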

| Task | Converged | Total Params | Mean Null Dim | Std | Upper Bound |
|---|---|---|---|---|---|
| Copy-Last (V=2,T=3) | 8/8 | 3,136 | 4.0 | 0.0 | 320 |
| XOR (V=2,T=3) | 8/8 | 3,136 | 4.0 | 0.0 | 320 |
| Copy-Last (V=2,T=4) | 8/8 | 3,136 | 16.1 | 0.33 | 320 |
| Copy-Last (V=3,T=2) | 8/8 | 3,168 | 0.0 | 0.0 | 320 |
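
The source does not say how `Null Dim` was measured. One common reading is that it counts directions of near-zero curvature of the loss at a converged solution, so that each continuous symmetry contributes one flat direction. The toy sketch below illustrates that idea only; it is not the study's code. The loss `(a*b - 1)^2`, the helper names, the step size `eps`, and the tolerance `tol` are all assumptions chosen for illustration. The loss has a one-parameter rescaling symmetry `(a, b) -> (s*a, b/s)`, so one flat direction is expected at any minimum.

```python
def toy_loss(p):
    """Toy loss with a rescaling symmetry: (a, b) -> (s*a, b/s)."""
    a, b = p
    return (a * b - 1.0) ** 2


def hessian(f, p, eps=1e-3):
    """Central finite-difference Hessian of f at point p."""
    n = len(p)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                q = list(p)
                q[i] += si * eps
                q[j] += sj * eps
                return f(q)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * eps * eps)
    return H


def matrix_rank(M, tol=1e-6):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    A = [row[:] for row in M]
    n_rows, n_cols = len(A), len(A[0])
    rank = 0
    for c in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda i: abs(A[i][c]))
        if abs(A[pivot][c]) < tol:
            continue  # no usable pivot in this column
        A[rank], A[pivot] = A[pivot], A[rank]
        for i in range(rank + 1, n_rows):
            factor = A[i][c] / A[rank][c]
            for k in range(c, n_cols):
                A[i][k] -= factor * A[rank][k]
        rank += 1
    return rank


def null_dim(f, p, eps=1e-3, tol=1e-6):
    """Number of flat directions: parameter count minus Hessian rank."""
    return len(p) - matrix_rank(hessian(f, p, eps), tol)


if __name__ == "__main__":
    # Two different minima on the same solution curve a*b = 1.
    print(null_dim(toy_loss, [1.0, 1.0]))
    print(null_dim(toy_loss, [2.0, 0.5]))
```

For a real model the Hessian would be far too large for this dense approach, and the tolerance separating "zero" from "small" curvature becomes a genuine modeling choice, which may be one reason a measured null dimension can sit well below a symmetry-based upper bound.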