CV

Hierarchical Physics-Constrained Learning for Physically Consistent Geometry at Scale

Bridging classical geometric consistency with learned robustness through differentiable pose graph optimization and chunked attention

Open Problem

Learning physically consistent geometry at scale remains a challenging open problem (Xu et al., GPA-VGGT 2026). Without structured constraints, learned predictions suffer from scale drift, inconsistent geometry across viewpoints, and violation of physical laws. This work resolves the tension by making classical constraints differentiable and embedding them as hierarchical losses within a learning framework.

40.0%
Translation Error Reduction
Full model vs Baseline
14.4x
Scale Drift Reduction
Hierarchical anchoring
47.1x
Computational Speedup
Chunked vs Global at N=1000
~60%
Gravity Misalignment Reduction
Self-supervised enforcement

Three-Tier Hierarchical Framework

Tier 1

Local: Epipolar Consistency

Symmetric Sampson distance between frame pairs ensures predicted depth and pose are mutually consistent. Largest single contribution: removing it causes +29.0% translation error increase.

Tier 2

Window: SE(3) Composition Closure + Scale Consistency

Penalizes deviation from identity for pose cycles and enforces depth scale agreement across overlapping windows via log-ratio loss.

Tier 3

Global: Gravity Alignment + Ground Plane

Self-supervised enforcement of consistent gravity direction and coplanar ground points. Reduces gravity misalignment by ~60% without ground-truth gravity.

Pose Graph Optimization: Translation Error

Pose Graph Optimization: Rotation Error

Scale Drift Analysis

Loss Component Ablation

Scalability: Chunked vs Global Attention

Gravity Coherence

Window Configuration Analysis

Pose Graph Results

NoiseNo OptimizationSeq. OnlyWith Loops
Trans(m)Rot(deg)TransRotTransRot
0.010.401.410.401.410.341.18
0.051.096.801.096.800.946.27
0.101.5217.681.5217.681.3214.02
0.204.1947.384.1947.383.3440.64
0.304.6237.674.6237.674.2130.63

Loss Ablation Results

ConfigTrans(m)Rot(deg)Scale CVGrav
Full (Ours)0.3593.467.8e-50.972
No Epipolar0.4633.9310.0e-50.971
No Composition0.4113.708.9e-50.971
No Gravity0.3793.568.3e-50.971
No Scale0.4113.708.9e-50.971
No Ground0.3693.518.1e-50.972
Baseline0.5984.4612.6e-50.970