Long-Horizon Video Generation for Robotics

Hierarchical Temporal Diffusion with Coherence Anchoring (HTDCA) for minutes-long video generation with sustained temporal coherence.

HTDCA Quality @ 1024 frames: 0.911 (vs 0.175 Direct)
HTDCA Artifact Rate: 0% (vs 15.1% Stitching)
Memory Quality Boost: 2.3x (0.39 to 0.92 @ 512 frames)
Max Frames Tested: 1024 (~34 seconds @ 30 fps)
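
The derived figures in the summary above follow from simple arithmetic on the reported numbers. The sketch below reproduces them, assuming only the values stated on this page; the helper names (`duration_seconds`, `ratio`) are illustrative and not part of HTDCA.

```python
# Values copied from the summary figures; helper names are hypothetical.
FPS = 30
MAX_FRAMES = 1024


def duration_seconds(frames: int, fps: int = FPS) -> float:
    """Clip length in seconds for a given frame count and frame rate."""
    return frames / fps


def ratio(improved: float, baseline: float) -> float:
    """Multiplicative gain of one score over another."""
    return improved / baseline


clip_seconds = duration_seconds(MAX_FRAMES)   # 1024 frames at 30 fps
memory_boost = ratio(0.92, 0.39)              # memory ablation at 512 frames
quality_gain = ratio(0.911, 0.175)            # HTDCA vs Direct at 1024 frames

print(f"{clip_seconds:.1f} s")    # 34.1 s
print(f"{memory_boost:.2f}x")     # 2.36x
print(f"{quality_gain:.2f}x")     # 5.21x
```

The memory ratio comes out at about 2.36, which matches the reported 2.3x if that figure was truncated rather than rounded.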

[Figure: Quality vs Sequence Length]

[Figure: Artifact Rate vs Sequence Length]

[Figure: Memory Ablation (512 frames)]

[Figure: Task Complexity Scaling]

Full Results Table

Frames | Method | Quality | Consistency | Artifact %