Measuring and improving LLM explanation stability under diverse environmental conditions for autonomous driving systems
Keeping LLM explanations consistent in real-world use as environmental conditions vary remains an open research challenge for the broader LLM ecosystem (Ferrag et al., 2026, AgentDrive). The ECIC framework addresses this by formalizing interpretability consistency through four complementary metrics: Attribution Invariance Score (AIS), Explanation Semantic Similarity (ESS), Faithfulness Gap (FG), and a composite Consistency Index (CI).
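To make the relationship between the component metrics and the composite concrete, here is a minimal sketch of how such an index could be assembled. The aggregation rule is an assumption for illustration, not the ECIC definition: it assumes AIS and ESS lie in [0, 1] with higher being better, that FG lies in [0, 1] with lower being better, and that CI is a weighted mean of AIS, ESS, and 1 - FG. The function name `consistency_index` and the default weights are hypothetical.

```python
from typing import Tuple


def consistency_index(
    ais: float,
    ess: float,
    fg: float,
    weights: Tuple[float, float, float] = (1.0, 1.0, 1.0),
) -> float:
    """Hypothetical composite: weighted mean of AIS, ESS, and (1 - FG).

    ASSUMPTION: this aggregation is illustrative only; the source does
    not state how CI is computed from the other metrics.
    """
    # All three inputs are assumed to be normalized to [0, 1].
    for name, value in (("ais", ais), ("ess", ess), ("fg", fg)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")

    w_ais, w_ess, w_fg = weights
    total = w_ais + w_ess + w_fg
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    # FG is "lower is better", so it enters the composite as (1 - FG).
    return (w_ais * ais + w_ess * ess + w_fg * (1.0 - fg)) / total
```

Under these assumptions, perfectly invariant attributions, identical explanations, and a zero faithfulness gap give the maximum score, and any growth in FG lowers the composite.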
Figure: CI as a function of visibility distance (10 m to 1000 m).
Figure: CI as a function of precipitation intensity (0.0 to 1.0).
All 50 evaluations (5 scenarios × 10 condition pairs) pass all three contrastive checks: rationale stability, adjustment coherence, and attribution proportionality.
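The count of 50 comes from crossing 5 scenarios with 10 condition pairs, and an evaluation counts as passing only if all three checks hold. A minimal sketch of that bookkeeping follows. The integer identifiers for scenarios and condition pairs, the check key names, and the helper functions are all hypothetical; how each check is actually computed is not specified here.

```python
from itertools import product
from typing import Dict, List, Tuple

# The three contrastive checks named in the text (key names are illustrative).
CHECKS = (
    "rationale_stability",
    "adjustment_coherence",
    "attribution_proportionality",
)

Evaluation = Tuple[int, int]  # (scenario index, condition-pair index)


def evaluation_grid(n_scenarios: int = 5, n_pairs: int = 10) -> List[Evaluation]:
    """Enumerate every (scenario, condition pair) combination."""
    return list(product(range(n_scenarios), range(n_pairs)))


def failing_evaluations(
    results: Dict[Evaluation, Dict[str, bool]]
) -> List[Evaluation]:
    """Return the evaluations where at least one check did not pass."""
    return [
        key
        for key, checks in results.items()
        if not all(checks[name] for name in CHECKS)
    ]


def all_pass(results: Dict[Evaluation, Dict[str, bool]]) -> bool:
    """True only if every evaluation passes every check."""
    return not failing_evaluations(results)
```

For example, filling a results dictionary with `True` for each check over `evaluation_grid()` yields 50 entries for which `all_pass` returns `True`; flipping any single check to `False` makes `failing_evaluations` report that (scenario, pair) key.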
| Configuration | CI | AIS | ESS | FG (lower = better) | DCR |
|---|---|---|---|---|---|
| Condition | Visibility (m) | Precipitation | Light | Friction | Severity |
|---|---|---|---|---|---|