Contract-based pre-execution validation for tool calls: safety-latency tradeoffs across verification strategies.
cs.AI, Agent Safety, Tool Verification
Comparison of verification approaches, from no checking to full formal verification.
Higher safety comes at the cost of higher latency; the cascaded approach provides the best balance.
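To make the idea of a cascade concrete, here is a minimal sketch, assuming the stages run in order of increasing cost and each stage may allow, block, or escalate a tool call. The stage functions, the `Verdict` enum, the tool names, and the path rules are all hypothetical illustrations, not the checks evaluated in this comparison.

```python
from enum import Enum
from typing import Any, Callable, Dict, List, Tuple


class Verdict(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    ESCALATE = "escalate"  # undecided: hand the call to the next, costlier stage


ToolCall = Dict[str, Any]
Check = Callable[[ToolCall], Verdict]


def schema_check(call: ToolCall) -> Verdict:
    """Cheapest stage (hypothetical): reject structurally malformed calls."""
    if not isinstance(call.get("tool"), str) or not isinstance(call.get("args"), dict):
        return Verdict.BLOCK
    return Verdict.ESCALATE  # well-formed, but not yet judged safe


def semantic_check(call: ToolCall) -> Verdict:
    """Mid-cost stage (hypothetical rules): decide the obvious cases only."""
    path = str(call["args"].get("path", ""))
    if call["tool"] == "read_file":
        return Verdict.ALLOW
    if call["tool"] == "delete_file" and path.startswith("/etc"):
        return Verdict.BLOCK
    return Verdict.ESCALATE


def formal_check(call: ToolCall) -> Verdict:
    """Stand-in for an expensive contract verifier: only /tmp/ paths pass."""
    path = str(call["args"].get("path", ""))
    return Verdict.ALLOW if path.startswith("/tmp/") else Verdict.BLOCK


def cascade(call: ToolCall, stages: List[Tuple[str, Check]]) -> Tuple[Verdict, str]:
    """Run stages in order and stop at the first decisive verdict."""
    for name, check in stages:
        verdict = check(call)
        if verdict is not Verdict.ESCALATE:
            return verdict, name
    return Verdict.BLOCK, "default"  # fail closed if no stage decides


STAGES: List[Tuple[str, Check]] = [
    ("schema", schema_check),
    ("semantic", semantic_check),
    ("formal", formal_check),
]
```

Under these assumptions, `cascade({"tool": "read_file", "args": {"path": "/x"}}, STAGES)` is settled by the semantic stage and never reaches the costly formal stage, which is where a cascade would save latency.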
Detection quality across verification strategies.
| Strategy | Safety (%) | Precision | Recall | F1 | Latency (ms) |
|---|---|---|---|---|---|
| None | 76.6 | 0.000 | 0.000 | 0.000 | 0 |
| Schema | 87.9 | 0.939 | 0.531 | 0.668 | 2 |
| Semantic | 90.2 | 0.829 | 0.728 | 0.771 | 90 |
| Formal | 98.4 | 0.990 | 0.940 | 0.963 | 625 |
| Combined | 91.9 | 0.858 | 0.808 | 0.829 | 286 |
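One way to read the safety-latency trade-off off the table is to compute, for each step up in latency, how many percentage points of safety are gained per added millisecond, and to check whether any strategy is dominated. The sketch below does this with the table's safety and latency values; the helper names `marginal_gains` and `pareto_front` are illustrative, not part of the original analysis.

```python
from typing import Dict, List, Tuple

# (strategy, safety in %, latency in ms), copied from the table above
ROWS: List[Tuple[str, float, float]] = [
    ("None", 76.6, 0),
    ("Schema", 87.9, 2),
    ("Semantic", 90.2, 90),
    ("Formal", 98.4, 625),
    ("Combined", 91.9, 286),
]


def marginal_gains(rows: List[Tuple[str, float, float]]) -> Dict[str, float]:
    """Safety points gained per extra millisecond versus the next-cheaper strategy."""
    ordered = sorted(rows, key=lambda r: r[2])  # sort by latency
    gains: Dict[str, float] = {}
    for (_, s0, l0), (name, s1, l1) in zip(ordered, ordered[1:]):
        gains[name] = (s1 - s0) / (l1 - l0)
    return gains


def pareto_front(rows: List[Tuple[str, float, float]]) -> List[str]:
    """Strategies that no other strategy beats on both safety and latency."""
    front = []
    for name, safety, latency in rows:
        dominated = any(
            s >= safety and l <= latency and (s > safety or l < latency)
            for _, s, l in rows
        )
        if not dominated:
            front.append(name)
    return front


if __name__ == "__main__":
    for name, gain in marginal_gains(ROWS).items():
        print(f"{name}: {gain:.4f} safety points per added ms")
    print("Pareto-optimal:", pareto_front(ROWS))
```

With the table's values no strategy is dominated: each step up in safety also costs latency, consistent with the trade-off described above.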