Verifiable Tool Action in AI Agents

Contract-based pre-execution validation for tool calls: safety-latency tradeoffs across verification strategies.
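To make the idea concrete, the following is a minimal sketch of contract-based pre-execution validation, not the implementation evaluated here. Every name in it (`ToolContract`, `guarded_call`, `delete_contract`, the `delete_file` tool, and the `/sandbox/` rule) is a hypothetical chosen for illustration. It assumes a contract is a set of named preconditions over the call arguments, all of which must hold before the tool runs.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


# Hypothetical contract type: named preconditions over the call arguments.
# This structure is an assumption for illustration, not taken from the source.
@dataclass
class ToolContract:
    tool_name: str
    preconditions: dict[str, Callable[[dict[str, Any]], bool]] = field(
        default_factory=dict
    )

    def violations(self, args: dict[str, Any]) -> list[str]:
        """Return the names of every precondition the arguments fail."""
        failed = []
        for name, check in self.preconditions.items():
            try:
                ok = check(args)
            except Exception:
                ok = False  # a check that cannot be evaluated counts as failed
            if not ok:
                failed.append(name)
        return failed


def guarded_call(contract: ToolContract, tool: Callable[..., Any],
                 args: dict[str, Any]) -> dict[str, Any]:
    """Run `tool` only if `args` satisfy the contract; otherwise block it."""
    failed = contract.violations(args)
    if failed:
        return {"executed": False, "violations": failed}
    return {"executed": True, "result": tool(**args)}


# Illustrative contract for a made-up file-deletion tool.
delete_contract = ToolContract(
    tool_name="delete_file",
    preconditions={
        "path_is_str": lambda a: isinstance(a.get("path"), str),
        "inside_sandbox": lambda a: a["path"].startswith("/sandbox/"),
    },
)


def delete_file(path: str) -> str:
    return f"deleted {path}"  # stand-in; no real file is touched
```

Calling `guarded_call(delete_contract, delete_file, {"path": "/etc/passwd"})` blocks the call and reports `inside_sandbox` as the violated precondition, while a path under `/sandbox/` is executed. The key design choice is that validation happens before the tool has any side effects.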

cs.AI | Agent Safety | Tool Verification
Combined Safety Rate: 91.9%
Formal Safety Rate: 98.4%
Combined Latency: 286 ms
Tool Calls Tested: 4,000

Safety Rate by Strategy

Comparison of verification approaches from no checking to full formal verification.

Safety vs Latency Tradeoff

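Higher safety costs more latency: safety rises from 87.9% at 2 ms (Schema) to 98.4% at 625 ms (Formal). The cascaded approach (presumably the Combined strategy, at 91.9% and 286 ms) is described as the best balance.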
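The page does not say how the cascade is built. One common reading, sketched below purely as an assumption, is that cheap checks run first and a call is escalated to a more expensive verifier only when the cheaper stage cannot decide. The stage functions, the tool names, and the `amount <= 100` rule are all hypothetical stand-ins, not the verifiers that were evaluated.

```python
from typing import Any, Callable, Optional

# Each stage returns True (safe), False (unsafe), or None (undecided).
Verdict = Optional[bool]
Stage = tuple[str, Callable[[dict[str, Any]], Verdict]]


def cascade(call: dict[str, Any], stages: list[Stage]) -> tuple[bool, str]:
    """Run stages in order and stop at the first definite verdict.

    Returns the verdict and the name of the stage that produced it.
    If no stage decides, the call is denied by default.
    """
    for name, check in stages:
        verdict = check(call)
        if verdict is not None:
            return verdict, name
    return False, "default-deny"


def schema_check(call: dict[str, Any]) -> Verdict:
    # Cheap structural check: reject malformed calls, otherwise defer.
    if not isinstance(call.get("args"), dict):
        return False
    return None


def semantic_check(call: dict[str, Any]) -> Verdict:
    # Toy heuristic: accept read-only tools, defer on everything else.
    if call.get("tool") in {"read_file", "search"}:
        return True
    return None


def formal_check(call: dict[str, Any]) -> Verdict:
    # Stand-in for an expensive verifier that always reaches a verdict.
    return call["args"].get("amount", 0) <= 100


# Ordered from cheapest to most expensive (hypothetical ordering).
STAGES: list[Stage] = [
    ("schema", schema_check),
    ("semantic", semantic_check),
    ("formal", formal_check),
]
```

Under this reading, average latency depends on how often calls fall through to the expensive stage, which is why a cascade can land between the cheapest and the most expensive strategy on both safety and latency.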

Precision, Recall, F1

Detection quality across verification strategies.
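For reference, the three metrics can be computed from confusion counts as in the sketch below. It assumes that a "positive" is an unsafe tool call flagged by the verifier, which the page does not state explicitly; the function name and the zero-division convention are choices made for this example.

```python
def detection_metrics(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) for unsafe-call detection.

    tp: unsafe calls that were flagged
    fp: safe calls that were flagged
    fn: unsafe calls that were missed
    Undefined ratios (zero denominators) are reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

With this convention, a strategy that never flags anything has no true or false positives, so all three metrics are 0.0. Note that an F1 averaged over several runs need not equal the harmonic mean of the averaged precision and recall.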

Results Table

Strategy    Safety    Precision    Recall    F1       Latency
None        76.6%     0.000        0.000     0.000    0 ms
Schema      87.9%     0.939        0.531     0.668    2 ms
Semantic    90.2%     0.829        0.728     0.771    90 ms
Formal      98.4%     0.990        0.940     0.963    625 ms
Combined    91.9%     0.858        0.808     0.829    286 ms