Prevalence of Security Vulnerabilities in Agent Skills

Problem & Methods

Agent skills are modular packages, distributed via public marketplaces, that contain SKILL.md instructions and optional bundled scripts. Despite rapid adoption, the security posture of these skill packages remains largely uncharacterized.

Simulation Framework

We model a marketplace of N = 2,500 agent skill packages. Each skill is characterized by its category, code complexity, number of bundled scripts, requested permissions, popularity tier, and vetting status. A simulated multi-layer vulnerability scanner evaluates each skill against 10 vulnerability classes. The simulation uses a fixed random seed for full reproducibility.
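As a rough sketch of what a seeded, reproducible marketplace generator could look like: the `Skill` fields, the `generate_marketplace` name, the seed value, and the value ranges below are all hypothetical. The vetting weights simply reuse the group sizes reported in the vetting table; the source does not state the sampling distributions it actually used.

```python
import random
from dataclasses import dataclass

# Assumption: weights reuse the group sizes from the vetting table (1,386 / 771 / 343).
VETTING_LEVELS = ["unreviewed", "auto_scanned", "human_reviewed"]
VETTING_WEIGHTS = [1386, 771, 343]


@dataclass(frozen=True)
class Skill:
    """Hypothetical record for one simulated skill package."""
    skill_id: int
    vetting: str
    n_scripts: int  # number of bundled scripts (illustrative range)
    n_perms: int    # number of requested permissions (illustrative range)


def generate_marketplace(n_skills=2500, seed=42):
    """Draw n_skills skills from a private, seeded random generator."""
    rng = random.Random(seed)  # fixed seed makes every run identical
    skills = []
    for skill_id in range(n_skills):
        skills.append(Skill(
            skill_id=skill_id,
            vetting=rng.choices(VETTING_LEVELS, weights=VETTING_WEIGHTS, k=1)[0],
            n_scripts=rng.randint(0, 5),
            n_perms=rng.randint(0, 8),
        ))
    return skills


first_run = generate_marketplace()
second_run = generate_marketplace()
print(first_run == second_run)  # same seed, same population
```

Using a private `random.Random(seed)` instance rather than the module-level generator keeps the simulation reproducible even if other code draws random numbers in between.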

For each skill-vulnerability pair, the detection probability is computed as:

p_v = min(0.95, r_v * m_{c,v} * [log(complexity + 1) / log(101)] * (1 + 0.08 * n_perms) * f_vet)

where r_v is the base rate for vulnerability class v, m_{c,v} is the multiplier for skill category c and class v, n_perms is the number of requested permissions, and f_vet is the vetting reduction factor (1.0 for unreviewed, 0.65 for auto-scanned, 0.30 for human-reviewed). The probability is capped at 0.95.
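The formula can be written out as a small function. This is a minimal sketch: the function and argument names are illustrative, and the base rates and category multipliers passed in the usage lines are made-up inputs, since the source does not list them. Only the functional form and the three vetting factors come from the text.

```python
import math

# Vetting reduction factors f_vet, as stated in the text.
VETTING_FACTOR = {
    "unreviewed": 1.0,
    "auto_scanned": 0.65,
    "human_reviewed": 0.30,
}


def detection_probability(base_rate, category_multiplier, complexity, n_perms, vetting):
    """p_v = min(0.95, r_v * m_cv * log(complexity+1)/log(101) * (1 + 0.08*n_perms) * f_vet)."""
    complexity_term = math.log(complexity + 1) / math.log(101)  # equals 1 when complexity == 100
    permission_term = 1 + 0.08 * n_perms                        # +8% per requested permission
    raw = (base_rate * category_multiplier * complexity_term
           * permission_term * VETTING_FACTOR[vetting])
    return min(0.95, raw)  # cap from the formula


# Illustrative inputs only: r_v = 0.2 and m_cv = 1.0 are not values from the study.
p_unreviewed = detection_probability(0.2, 1.0, complexity=100, n_perms=0, vetting="unreviewed")
p_reviewed = detection_probability(0.2, 1.0, complexity=100, n_perms=0, vetting="human_reviewed")
print(p_unreviewed, p_reviewed)
```

With everything else held fixed, human review scales the probability by 0.30, and each additional requested permission raises it by 8% of the unscaled value until the 0.95 cap applies.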

Interactive Results

[Interactive charts not reproduced here. Panels: Prevalence by Vulnerability Class; Instance Count by Vulnerability Class; Prevalence by Skill Category; Mean Vulnerabilities per Skill by Category; Prevalence by Vetting Status; Vetting Pipeline Distribution; Prevalence by Popularity Tier; Prevalence by Code Complexity; Severity Distribution per Vulnerability Class; Top Vulnerability Co-occurrence Pairs (Conditional Probability).]

Data Tables

Detailed numerical results from the simulation study.

Overall Vulnerability Prevalence (N = 2,500)

Metric	Value
Skills scanned	2,500
Vulnerable skills	1,899
Overall prevalence	0.7596
Critical prevalence	0.2748
High-or-critical prevalence	0.5216
Total vulnerabilities	3,863
Mean vulns per skill	1.5452
Mean vulns per vulnerable skill	2.0342
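The ratio metrics in this table follow directly from three raw counts. A short check, using only numbers from the table (the variable names are illustrative):

```python
# Raw counts from the table above.
skills_scanned = 2500
vulnerable_skills = 1899
total_vulnerabilities = 3863

# Share of skills with at least one vulnerability.
overall_prevalence = vulnerable_skills / skills_scanned
# Mean over all skills, including the clean ones.
mean_per_skill = total_vulnerabilities / skills_scanned
# Mean over vulnerable skills only, so necessarily larger.
mean_per_vulnerable_skill = total_vulnerabilities / vulnerable_skills

print(round(overall_prevalence, 4))         # 0.7596
print(round(mean_per_skill, 4))             # 1.5452
print(round(mean_per_vulnerable_skill, 4))  # 2.0342
```

The two means differ only in their denominator: dividing by the 1,899 vulnerable skills instead of all 2,500 raises the figure from about 1.5 to about 2.0.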

Prevalence and Severity by Vulnerability Class

Vulnerability Class	Prevalence	Critical (share)	High (share)	Medium (share)	Low (share)	Count
Missing input validation	0.2992	0.0481	0.1832	0.4398	0.3289	748
Excessive permissions	0.2932	0.1173	0.2606	0.3752	0.2469	733
Supply chain integrity	0.2044	0.2505	0.3190	0.3190	0.1115	511
Prompt injection	0.1680	0.2405	0.3810	0.2929	0.0857	420
Credential leakage	0.1636	0.3374	0.3227	0.2445	0.0954	409
Path traversal	0.1216	0.1842	0.3487	0.3026	0.1645	304
Data exfiltration	0.1196	0.3579	0.2408	0.2843	0.1171	299
Arbitrary code execution	0.0860	0.4186	0.3442	0.2093	0.0279	215
Dependency confusion	0.0572	0.3077	0.3217	0.2937	0.0769	143
Insecure deserialization	0.0324	0.3333	0.2593	0.2963	0.1111	81
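Two relationships in this table are worth making explicit. Each class's prevalence equals its count divided by the 2,500 skills, and the four severity columns sum to 1 in every row, so they read as shares of that class's instances rather than shares of all skills. The sketch below checks both using table values; the helper names are illustrative, and the per-class critical counts are back-calculated estimates, not figures reported by the source.

```python
N_SKILLS = 2500

# (instance count, share of those instances rated critical), copied from the table.
CLASS_ROWS = {
    "Missing input validation": (748, 0.0481),
    "Excessive permissions": (733, 0.1173),
    "Supply chain integrity": (511, 0.2505),
    "Prompt injection": (420, 0.2405),
    "Credential leakage": (409, 0.3374),
    "Path traversal": (304, 0.1842),
    "Data exfiltration": (299, 0.3579),
    "Arbitrary code execution": (215, 0.4186),
    "Dependency confusion": (143, 0.3077),
    "Insecure deserialization": (81, 0.3333),
}

# Summing the per-class counts should reproduce the reported total of 3,863.
total_instances = sum(count for count, _ in CLASS_ROWS.values())


def class_prevalence(name):
    """Fraction of all skills affected by the given class."""
    count, _ = CLASS_ROWS[name]
    return count / N_SKILLS


def estimated_critical_instances(name):
    """Back-calculated number of critical instances (count * critical share, rounded)."""
    count, critical_share = CLASS_ROWS[name]
    return round(count * critical_share)


print(total_instances)
print(round(class_prevalence("Missing input validation"), 4))
print(estimated_critical_instances("Arbitrary code execution"))
```

Read this way, arbitrary code execution is rare (215 instances) but severe, while missing input validation is common but rarely critical.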

Vulnerability Prevalence by Skill Category

Category	N	Vulnerable	Prevalence	Critical Prevalence	Mean Vulns
Security tools	153	124	0.8105	0.3203	1.7386
System admin	292	233	0.7979	0.3185	1.8014
Web automation	361	284	0.7867	0.2659	1.6205
Data analysis	408	314	0.7696	0.2794	1.6152
File management	243	183	0.7531	0.2551	1.4897
Misc	232	172	0.7414	0.2457	1.4828
Communication	247	183	0.7409	0.2794	1.4170
Coding	564	406	0.7199	0.2606	1.3670

Vulnerability Prevalence by Vetting Status

Vetting Status	N	Prevalence	Critical Prevalence
Unreviewed	1,386	0.8586	0.3341
Auto-scanned	771	0.7302	0.2374
Human-reviewed	343	0.4257	0.1195
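The vetting effect can be expressed as an absolute or a relative reduction, and the group sizes also allow a consistency check against the overall count of vulnerable skills. A minimal sketch using only table values; the variable names are illustrative, and the relative reduction is derived here rather than reported by the source.

```python
# (group size, prevalence), copied from the table above.
VETTING_ROWS = {
    "Unreviewed": (1386, 0.8586),
    "Auto-scanned": (771, 0.7302),
    "Human-reviewed": (343, 0.4257),
}

total_skills = sum(n for n, _ in VETTING_ROWS.values())
unreviewed_prev = VETTING_ROWS["Unreviewed"][1]
human_prev = VETTING_ROWS["Human-reviewed"][1]

# Absolute reduction in percentage points.
absolute_reduction_pp = (unreviewed_prev - human_prev) * 100
# Relative reduction: the fraction of the unreviewed prevalence that is removed (about 0.504).
relative_reduction = (unreviewed_prev - human_prev) / unreviewed_prev
# Share of the marketplace that is human-reviewed.
human_share = VETTING_ROWS["Human-reviewed"][0] / total_skills
# Size-weighted sum of group prevalences; should land near the 1,899 vulnerable skills.
implied_vulnerable = sum(n * p for n, p in VETTING_ROWS.values())

print(round(absolute_reduction_pp, 2))
print(round(relative_reduction, 3))
print(round(human_share, 4))
print(round(implied_vulnerable))
```

The absolute figure describes how many fewer skills per hundred are vulnerable; the relative figure shows that human review roughly halves prevalence compared with no review.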

Vulnerability Prevalence by Popularity Tier

Popularity	N	Prevalence	Critical Prevalence
Low	1,403	0.8076	0.2937
Medium	698	0.7249	0.2564
High	272	0.6691	0.2537
Very High	127	0.6142	0.2126

Vulnerability Prevalence by Code Complexity

Complexity Tier	N	Prevalence
Tiny (<50 lines)	774	0.6370
Small (50-200)	1,124	0.7891
Medium (200-500)	391	0.8568
Large (500-2000)	194	0.8608
Very Large (2000+)	17	1.0000
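The top tier contains only 17 skills, so its prevalence of 1.0000 is less certain than the other rows. One way to gauge this, offered as an illustration and not as part of the source's analysis, is a Wilson score interval for a binomial proportion. The function name and the choice of a 95% level (z = 1.96) are assumptions.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 gives roughly 95%)."""
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    center = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Very Large tier: 17 of 17 skills vulnerable, per the table.
low_very_large, high_very_large = wilson_interval(17, 17)

# Small tier for comparison: 1,124 skills at 0.7891 prevalence, about 887 vulnerable.
low_small, high_small = wilson_interval(round(1124 * 0.7891), 1124)

print(low_very_large, high_very_large)
print(low_small, high_small)
```

The interval for the 17-skill tier is far wider than the one for the 1,124-skill tier, which is the reason to read the 100% figure as a small-sample result.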

Key Findings

Primary findings from the simulation-based measurement study of agent skill security.

High Overall Prevalence

In the simulation, 75.96% of skills contain at least one vulnerability, with a mean of 1.5452 vulnerabilities per skill. This is substantially higher than the 10-15% figure given for mature package ecosystems such as npm.

Critical Severity Concentration

27.48% of skills contain at least one critical-severity vulnerability, and 52.16% contain at least one high or critical issue. Arbitrary code execution has the highest critical share of any class: 41.86% of its detected instances are rated critical.

Dominant Vulnerability Classes

Missing input validation (29.92% prevalence) and excessive permissions (29.32%) are the two most common classes. Supply chain integrity gaps affect 20.44% of skills.

Category Risk Variation

Security tools (81.05%) and system administration (79.79%) skills are the most vulnerable categories, while coding skills are the least vulnerable (71.99%). Paradoxically, the category intended to improve security has the highest vulnerability rate in the ecosystem.

Vetting Effectiveness

Human-reviewed skills show 42.57% prevalence versus 85.86% for unreviewed skills, a 43.29 percentage-point absolute reduction. However, only 13.7% of marketplace skills (343 of 2,500) are human-reviewed.

Complexity-Prevalence Gradient

Prevalence rises from 63.70% for tiny skills (<50 lines) to 86.08% for large skills (500-2000 lines) and reaches 100% for very large skills (2000+ lines), although that last tier contains only 17 skills.