Unified Memory Management Without External Expert Models

Unified Policy TSR: 0.682
Cost Reduction: 2.1x
Memory Coherence: 0.744
TSR Improvement: +7.4%

The unified policy achieves a TSR of 0.682 versus 0.635 for the external expert model, a 7.4% relative improvement.
Inference cost falls from 154.3 to 73.5 FLOPs units, a 2.1x reduction.
Unified policy converges fastest in training (final loss 0.010).
The unified policy's advantage grows with conversation length, which is attributed to better long-horizon memory management.
Joint optimization of memory and task execution enables end-to-end deployment.
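
The two headline ratios can be reproduced from the reported figures. The sketch below is only an arithmetic check, assuming the TSR gain is a relative improvement over the external expert baseline and the cost reduction is the ratio of the two inference costs. The function and variable names are illustrative and do not come from the source.

```python
def relative_improvement(new: float, baseline: float) -> float:
    """Relative gain of `new` over `baseline`, as a fraction of the baseline."""
    return (new - baseline) / baseline


def reduction_factor(before: float, after: float) -> float:
    """How many times smaller `after` is than `before`."""
    return before / after


# Figures as reported in the summary above.
unified_tsr = 0.682   # unified policy TSR
expert_tsr = 0.635    # external expert TSR
cost_before = 154.3   # external expert inference cost (FLOPs units)
cost_after = 73.5     # unified policy inference cost (FLOPs units)

tsr_gain = relative_improvement(unified_tsr, expert_tsr)
cost_factor = reduction_factor(cost_before, cost_after)

print(f"TSR improvement: {tsr_gain:+.1%}")    # +7.4%
print(f"Cost reduction: {cost_factor:.1f}x")  # 2.1x
```

Under these assumptions, the absolute TSR gap is 0.047, so the 7.4% figure is relative to the baseline rather than a gain in percentage points.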