This is a reusable validation framework that compares ABM-generated distributions to empirical reference data at three levels: contact network structure, contact tracing (CT) process parameters, and aggregate outcomes.
Agent-based models (ABMs) of epidemic contact tracing rely on synthetic populations and assumed operational parameters, yet their CT processes are rarely validated against real-world epidemiological data. This framework addresses the open problem identified by Chae et al. (2026), who acknowledged that their ABM simulations could not be quantitatively validated against actual CT logs.
Level 1 -- Contact Network Structure: Daily contact degree distributions and age-mixing matrices compared against POLYMOD survey data (Mossong et al., 2008).
Level 2 -- CT Process Parameters: Notification delays, recall probabilities, and contacts per interview compared against KDCA and CDC operational data.
Level 3 -- Aggregate CT Outcomes: Overall fraction of contacts traced and epidemic trajectory metrics.
Statistical Tests: Kolmogorov-Smirnov (KS) statistic, Jensen-Shannon (JS) divergence, and Earth Mover's Distance (EMD), with 95% bootstrap confidence intervals (1,000 resamples) reported for the EMD.
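The three statistics and the bootstrap interval can be sketched as follows. This is a minimal illustration, not the framework's actual code: the function name `compare_distributions`, the shared 50-bin histogram used for the JS divergence, and the percentile bootstrap are all assumptions. The natural-log base is chosen because it caps the JS divergence at ln 2 ≈ 0.6931, which matches the largest value in the results table.

```python
import numpy as np
from scipy.spatial.distance import jensenshannon
from scipy.stats import ks_2samp, wasserstein_distance


def compare_distributions(sim, ref, bins=50, n_boot=1000, seed=0):
    """Compare a simulated sample to a reference sample (hypothetical helper)."""
    sim = np.asarray(sim, dtype=float)
    ref = np.asarray(ref, dtype=float)
    rng = np.random.default_rng(seed)

    # Two-sample KS statistic: largest gap between the empirical CDFs.
    ks_stat = ks_2samp(sim, ref).statistic

    # JS divergence on histograms with shared bin edges (assumed binning).
    edges = np.histogram_bin_edges(np.concatenate([sim, ref]), bins=bins)
    p, _ = np.histogram(sim, bins=edges)
    q, _ = np.histogram(ref, bins=edges)
    # SciPy returns the JS *distance* (square root of the divergence).
    js_div = jensenshannon(p, q) ** 2

    # Earth Mover's Distance (1-D Wasserstein distance) on the raw samples.
    emd = wasserstein_distance(sim, ref)

    # Percentile bootstrap: resample both samples with replacement.
    boot = np.empty(n_boot)
    for i in range(n_boot):
        s = rng.choice(sim, size=sim.size, replace=True)
        r = rng.choice(ref, size=ref.size, replace=True)
        boot[i] = wasserstein_distance(s, r)
    ci_low, ci_high = np.percentile(boot, [2.5, 97.5])

    return {
        "ks": float(ks_stat),
        "js_div": float(js_div),
        "emd": float(emd),
        "emd_ci": (float(ci_low), float(ci_high)),
    }
```

Because the EMD is expressed in the units of the compared quantity (contacts, days, or a fraction), its magnitude is not directly comparable across rows, whereas the KS statistic and JS divergence are bounded and unit-free.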
Cosine similarity: 0.9856 | RMSE: 0.55
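The text does not state which quantities these two numbers compare. Assuming they measure agreement between the simulated and reference age-mixing matrices from Level 1, a minimal sketch of both metrics looks like this; the function name `matrix_agreement` and the choice to flatten the matrices before comparing them are illustrative assumptions.

```python
import numpy as np


def matrix_agreement(sim_matrix, ref_matrix):
    """Cosine similarity and RMSE between two same-shaped matrices (hypothetical helper)."""
    sim = np.asarray(sim_matrix, dtype=float).ravel()
    ref = np.asarray(ref_matrix, dtype=float).ravel()
    if sim.shape != ref.shape:
        raise ValueError("matrices must have the same shape")

    # Cosine similarity compares the *pattern* of mixing, ignoring overall scale.
    cosine = float(sim @ ref / (np.linalg.norm(sim) * np.linalg.norm(ref)))

    # RMSE compares absolute contact rates, cell by cell.
    rmse = float(np.sqrt(np.mean((sim - ref) ** 2)))
    return cosine, rmse
```

The two metrics are complementary: a matrix that is uniformly scaled relative to the reference keeps a cosine similarity of 1 while its RMSE grows.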

| Level | Distribution | KS statistic | JS divergence | EMD | EMD 95% CI lower | EMD 95% CI upper | Status |
|---|---|---|---|---|---|---|---|
| 1 | Daily contacts | 0.0809 | 0.0044 | 2.3547 | 1.8996 | 2.8262 | PASS |
| 2 | Notification delay | 0.0382 | 0.0045 | 0.1128 | 0.0914 | 0.1363 | PASS |
| 2 | Contacts per interview | 0.3933 | 0.1948 | 4.3947 | 4.2387 | 4.5213 | FAIL |
| 2 | Recall probability | 0.1924 | 0.1475 | 0.1013 | 0.0960 | 0.1062 | FAIL |
| 3 | Traced fraction | 1.0 | 0.6931 | 3.6296 | 3.6267 | 3.6327 | FAIL |

Reference Distributions:

| Distribution | Source | n | Mean | Median | Std Dev | Min | Max |
|---|---|---|---|---|---|---|---|
| Daily contacts | POLYMOD (NegBin, mean=13.4, disp=0.5) | 5,000 | 13.11 | 6.0 | 18.71 | 0 | 216 |
| Notification delay | KDCA (Gamma, shape=2.5, scale=0.6) | 5,000 | 1.50 | 1.29 | 0.97 | 0.02 | 8.13 |
| Contacts per interview | CDC (Poisson, lambda=5.0) | 5,000 | 5.00 | 5.0 | 2.26 | 0 | 14 |
| Recall probability | Bi et al. (Beta, a=6, b=4) | 5,000 | 0.60 | 0.61 | 0.15 | 0.14 | 0.97 |
| Traced fraction | Park et al. (Beta, a=12, b=7) | 5,000 | 0.63 | 0.64 | 0.11 | 0.22 | 0.90 |
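The parametric forms in the Source column are enough to draw comparable reference samples. The sketch below is an illustration under stated assumptions rather than the framework's own sampling code: the function name `sample_references` and the dictionary keys are hypothetical, and `disp` is interpreted as the negative binomial size parameter k, so that the variance is mean + mean²/k. That interpretation gives a standard deviation of about 19.3 for daily contacts, in line with the tabulated 18.71.

```python
import numpy as np


def sample_references(n=5000, seed=0):
    """Draw reference samples from the tabulated parametric forms (hypothetical helper)."""
    rng = np.random.default_rng(seed)

    # Negative binomial given as (mean, dispersion k); NumPy expects (n, p).
    # Assumption: "disp" is the size parameter k, so p = k / (k + mean).
    mean_contacts, k = 13.4, 0.5
    p = k / (k + mean_contacts)

    return {
        "daily_contacts": rng.negative_binomial(k, p, size=n),
        "notification_delay": rng.gamma(shape=2.5, scale=0.6, size=n),  # days
        "contacts_per_interview": rng.poisson(lam=5.0, size=n),
        "recall_probability": rng.beta(6, 4, size=n),
        "traced_fraction": rng.beta(12, 7, size=n),
    }
```

The exact minima, maxima, and medians depend on the random seed, so a rerun should approximate the tabulated summary statistics rather than reproduce them digit for digit.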

Simulation Parameters and Outcomes:

| Parameter | Value | Description |
|---|---|---|
| Population size | 10,000 | Number of simulated agents |
| Mean contacts | 12.0 | Mean daily contacts per agent |
| Contact dispersion | 0.45 | Negative binomial overdispersion |
| Recall rate | 0.55 | Probability of recalling a contact |
| Notification delay | Gamma(2.0, 0.8) | Days from confirmation to notification |
| Tracing success | 0.70 | Probability a notified contact is reached |
| R0 | 2.5 | Basic reproduction number |
| Infectious period | 7.0 days | Mean infectious period |
| Simulation days | 90 | Duration of simulation |
| Total infections | 10,000 | Observed epidemic outcome |
| Peak daily cases | 1,332 | Maximum daily new infections |