Verifiable Tool Action in AI Agents

Contract-based pre-execution validation for tool calls: safety-latency tradeoffs across verification strategies.

cs.AI | Agent Safety | Tool Verification

Combined Safety Rate: 91.9%

Formal Safety Rate: 98.4%

Combined Latency: 286ms

Tool Calls Tested: 4,000

Safety Rate by Strategy

Comparison of verification approaches from no checking to full formal verification.

Higher safety costs more latency. The cascaded approach (Combined) provides the best balance: 91.9% safety at 286ms, versus 98.4% at 625ms for full formal verification.
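The source does not spell out how the cascaded approach is built, so the following is only a minimal sketch of one plausible reading: cheap checks run first, and a call escalates to a costlier stage only when the current stage cannot decide. The `ToolCall` type, the three stage functions, the `transfer` tool and its rules are all hypothetical illustrations, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ToolCall:
    name: str
    args: dict


# A stage returns True (safe), False (unsafe), or None (undecided: escalate).
Verifier = Callable[[ToolCall], Optional[bool]]


def schema_check(call: ToolCall) -> Optional[bool]:
    # Hypothetical contract: "transfer" needs a numeric "amount".
    if call.name == "transfer" and not isinstance(call.args.get("amount"), (int, float)):
        return False
    return None  # well-formed, but shape alone does not prove safety


def semantic_check(call: ToolCall) -> Optional[bool]:
    # Hypothetical policy: small non-negative transfers are accepted outright.
    if call.name == "transfer" and 0 <= call.args["amount"] <= 100:
        return True
    return None


def formal_check(call: ToolCall) -> Optional[bool]:
    # Stand-in for an expensive final stage; this toy version rejects
    # whatever the cheaper stages could not settle.
    return False


STAGES = [("schema", schema_check), ("semantic", semantic_check), ("formal", formal_check)]


def cascade(call, stages=STAGES):
    """Run stages cheapest-first and stop at the first definite verdict."""
    for stage_name, verify in stages:
        verdict = verify(call)
        if verdict is not None:
            return verdict, stage_name
    return False, "default-deny"  # no stage decided: block the call
```

Under this reading, the average latency of a cascade depends on how many calls the early stages settle, which is one way a combined strategy could sit between the semantic and formal latencies.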

Detection quality across verification strategies.

Strategy	Safety Rate	Precision	Recall	F1	Latency
None	76.6%	0.000	0.000	0.000	0ms
Schema	87.9%	0.939	0.531	0.668	2ms
Semantic	90.2%	0.829	0.728	0.771	90ms
Formal	98.4%	0.990	0.940	0.963	625ms
Combined	91.9%	0.858	0.808	0.829	286ms
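To make the safety-latency tradeoff in the table above concrete, the short script below orders the strategies by latency and reports how much safety each step adds and at what extra cost. The numbers are copied from the table; the `RESULTS` and `marginal_gains` names are illustrative, and the step-by-step comparison is an added reading aid rather than an analysis from the source.

```python
# (safety rate in %, latency in ms), copied from the table above.
RESULTS = {
    "None": (76.6, 0),
    "Schema": (87.9, 2),
    "Semantic": (90.2, 90),
    "Formal": (98.4, 625),
    "Combined": (91.9, 286),
}


def marginal_gains(results):
    """Order strategies by latency; report each step's added safety and latency."""
    ordered = sorted(results.items(), key=lambda item: item[1][1])
    steps = []
    for (_, (prev_safety, prev_ms)), (name, (safety, ms)) in zip(ordered, ordered[1:]):
        steps.append((name, round(safety - prev_safety, 1), ms - prev_ms))
    return steps


for name, gain, extra_ms in marginal_gains(RESULTS):
    print(f"{name}: +{gain} points for +{extra_ms} ms")
```

Read this way, schema checking adds 11.3 points for 2ms, while moving from Combined to Formal adds 6.5 points for a further 339ms.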