The quality layer for AI skills, agents, and workflows.

Audit any skill, agent, or tool setup for quality, efficiency, and safety. One artifact or the whole repo. Same input, same score.

audit / sample-skill
91 · EXCELLENT
Overall score
8 dimensions · 99 checks · repeatable
100 − 4 (1 High) − 5 (2 Medium) = 91
1 High · 2 Medium
1/11 skills missing allowed-tools field
Token bloat in references/ (~400 tokens)
Outdated tool definition in 2 skills

Registries list what exists. Stars rank popularity. Scanners check for malware.
Nobody checks whether the thing is actually well built.

We're the layer that does.

01 · For teams & internal builders

Make sure your setup performs.

Treat your AI stack like the rest of your code — reviewed, measured, repeatable.

findings · your-repo/
broken ref: scripts/foo.py
4/11 skills missing allowed-tools
−1,540 tokens redundant
Broken paths · Token waste · Private by default
02 · For creators sharing skills and agents

Ship something you're proud of.

Pre-publish review, a verified badge, and a public profile once you're ready.

91
Ready to ship
agentquality.io/b/your-skill
![Agent Quality](…/b/your-skill.svg)
Verified badge · Re-audit on release · No repo? Upload a zip
03 · For people installing what they didn't write

Check before you trust.

You can't verify every shared skill by hand. We do.

Buyer's guide
Good for
solo devs · prototyping
Not for
team infra · prod
Where it's rough
3 broken refs · token bloat
Risk surface · Buyer's Guide · Browse the directory

Four questions. Every audit answers all of them.

99 checks. 8 dimensions. Repeatable.

VALID

Is it valid?

We check: Structure, metadata, spec compliance — name ≤ 64 chars, description ≤ 1024 chars, required fields present.

Why it matters

Progressive disclosure means only frontmatter loads initially [4]. Get it wrong and Claude Code may never discover or load the skill. Hours of work that never fire.
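The frontmatter constraints above can be sketched as a check. A minimal sketch, assuming YAML-style frontmatter at the top of a SKILL.md and a required-field set of `name` and `description` (the real audit's rule set may differ, and a real checker would use a proper YAML parser):

```python
import re

REQUIRED_FIELDS = {"name", "description"}  # assumed minimum
LIMITS = {"name": 64, "description": 1024}  # spec limits named above

def validate_frontmatter(skill_md: str) -> list[str]:
    """Return a list of findings for the frontmatter block of a skill file."""
    findings = []
    match = re.match(r"^---\n(.*?)\n---", skill_md, re.DOTALL)
    if not match:
        return ["no frontmatter block found"]
    # naive key: value parsing, enough for flat frontmatter
    fields = {}
    for line in match.group(1).splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    for field in sorted(REQUIRED_FIELDS - fields.keys()):
        findings.append(f"missing required field: {field}")
    for field, limit in LIMITS.items():
        if field in fields and len(fields[field]) > limit:
            findings.append(f"{field} exceeds {limit} chars")
    return findings
```

An empty findings list means the skill is at least discoverable; every entry maps to a reason Claude Code could skip or truncate it.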

EFFICIENT

Is it efficient?

We check: Token cost, redundancy, bloat, reference hygiene.

Why it matters

Every byte loaded burns context and money. Bloated skills crowd out the real work.

SAFE

Is it safe?

We check: Permissions declared, tools justified, risk surface visible, no unexpected access.

Why it matters

Snyk found prompt injection in 36% of audited skills, and supply-chain studies have surfaced 1,467 malicious payloads [3].

SOLID

Is it solid?

We check: Broken file paths, missing dependencies, dead references.

Why it matters

The #1 silent failure: a skill references scripts/foo.py that doesn't exist. Looks fine until it runs.

How it works

Standardized. Repeatable. Explained.

01

Submit

Paste a GitHub URL, a single skill file, a repo path, or upload a zip.

02

Review

Checks run per-element across every skill, agent, and script — not a single blanket pass.

03

Score

0–100 with a breakdown, findings pinned to where they live, and a Buyer's Guide verdict.

Paid audits include ready-to-paste fix prompts you can hand straight to Claude Code.

Your whole repo, mapped

Every skill, every tool, every external link — one frame.

See risk, redundancy, and broken paths at a glance.

architecture · sample-repo
tool use · external call
Why re-audit

Even a great skill built today gets stale fast.

Claude Code ships new features every few weeks [5]. Skills that don't adapt become the slow, expensive, deprecated way.

New capabilities
Features and patterns your skill should start using.
Deprecations
Outdated tool definitions, legacy fields, old patterns.
Claim drift
What your skill says vs. what the platform now supports.

Every audit ends with a verdict you can act on.

No user reviews. No upvotes. A Buyer's Guide derived from the audit itself.

buyer's guide / sample-skill
Good for you if
rapid prototyping · solo devs · Python codebases
Not for you if
production pipelines · multi-language repos · team-shared infra
Where it's rough
broken file refs (3) · undeclared tools on 4/11 skills · token bloat in 2 files
And one more thing

Honesty.

Five-star READMEs. “Miracle” tools with no substance. Every claim gets checked against the code.

Verified
Claim supported by code evidence
Outdated
Was true, no longer accurate
Aspirational
Partially true, overstated
Misleading
Contradicted by evidence
Marketing
No evidence either way

Make sure your AI tools are actually ready.

One audit, under a minute. Specific issues, pointed fixes.

Free during closed beta · GitHub login