AI Tools · 11 min read · May 8, 2026

Agent Skills: Production-Grade Workflows for AI Coding Agents

A practical look at Addy Osmani's agent-skills project — 20 reusable engineering skills, 7 lifecycle commands, agent personas, and quality gates for making AI coding agents behave more like senior engineers.

Tags: Agent Skills · AI Coding Agents · Claude Code · Developer Tools · Software Engineering · Code Review · Testing · Open Source
Neel Shah Tech Lead · Senior Data Engineer · Ottawa

AI coding agents are getting fast enough that the hard part is no longer “can the model write code?” The harder question is whether the agent follows the engineering discipline that keeps a real codebase healthy: write a spec, break the work down, build in slices, test what changed, review the result, and ship with evidence.

agent-skills by Addy Osmani is an open-source attempt to package that discipline into reusable workflows. The repository describes itself as production-grade engineering skills for AI coding agents. In practice, it is a set of Markdown-based skills, slash commands, personas, hooks, and checklists that make agents behave less like autocomplete and more like a careful teammate.

The important idea is simple: prompts are not enough. Agents need repeatable process.


What Agent Skills is

Agent Skills is a workflow pack for AI coding agents. It gives an agent structured instructions for the full software delivery loop:

Define -> Plan -> Build -> Verify -> Review -> Simplify -> Ship

The project maps that loop to 7 slash commands:

  • Define (/spec): Turn an idea into a concrete specification
  • Plan (/plan): Break work into small, verifiable tasks
  • Build (/build): Implement one useful slice at a time
  • Verify (/test): Prove the change works
  • Review (/review): Check quality before merge
  • Simplify (/code-simplify): Reduce unnecessary complexity
  • Ship (/ship): Prepare for production release

Under those commands are 20 skills. Each skill is a focused workflow with steps, triggers, red flags, rationalization checks, and verification requirements. That matters because agent failures are often process failures: the agent skips the spec, writes too much at once, forgets tests, rationalizes a flaky build, or declares success without evidence.

Agent Skills tries to stop those failure modes at the instruction layer.


The 20 skills are the real product

The slash commands are convenient entry points, but the skill library is where the value is.

The skills cover the normal lifecycle of production engineering:

  • Define: idea refinement, spec-driven development
  • Plan: planning and task breakdown
  • Build: incremental implementation, test-driven development, context engineering, source-driven development, frontend UI engineering, API/interface design
  • Verify: browser testing with DevTools, debugging and error recovery
  • Review: code review and quality, code simplification, security and hardening, performance optimization
  • Ship: git workflow, CI/CD automation, deprecation and migration, documentation and ADRs, shipping and launch
  • Meta: using agent skills

This is useful because AI agents usually need different constraints at different points in the work. A planning task should not use the same instructions as a security review. A UI implementation should not follow the same evidence standard as a migration plan. A debugging session needs reproduction and localization before it needs a patch.

Agent Skills gives each phase its own operating procedure.


Why this is better than one giant prompt

A common pattern with AI coding tools is to create one huge CLAUDE.md, AGENTS.md, or rules file that tries to encode everything: code style, testing standards, architecture boundaries, deployment rules, review checklists, and personal preferences.

That works for a while, then it becomes noisy. The agent receives instructions it does not need for the current task. Important rules get buried. The model may follow generic advice instead of the specific workflow that matters right now.

Agent Skills uses progressive disclosure instead. The skill file is the entry point. Supporting references are pulled in only when needed. That keeps the active context smaller and more relevant.
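The mechanism can be sketched as a two-level loader: only the skill's entry file enters the agent's context up front, and supporting references are read on demand. The file layout below is illustrative, not the repository's exact structure:

```python
from pathlib import Path

def load_skill(skill_dir: Path) -> str:
    """Put only the skill's entry-point file into the agent's context."""
    return (skill_dir / "SKILL.md").read_text()

def load_reference(skill_dir: Path, name: str) -> str:
    """Pull a supporting reference file in only when the current step needs it."""
    return (skill_dir / "references" / name).read_text()
```

The payoff is that a code-review session never pays the context cost of the deployment checklist, and vice versa.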

The design choice is subtle but important:

  • A skill is not just documentation.
  • A skill is not just a prompt snippet.
  • A skill is a procedure with exit criteria.

That is the difference between “remember to test” and “do not call this done until the relevant test evidence exists.”
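As an illustration (not the project's actual format), an exit criterion behaves like a gate rather than a reminder: "done" is refused until each criterion has concrete evidence behind it. The criterion names here are hypothetical:

```python
def can_mark_done(evidence: dict) -> tuple[bool, list[str]]:
    """A 'done' gate: every exit criterion needs recorded evidence, not intent."""
    required = {
        "tests_ran": "no test run recorded",
        "tests_passed": "tests did not pass",
        "diff_reviewed": "change was not reviewed",
    }
    # Collect the reason for every criterion that lacks evidence.
    missing = [msg for key, msg in required.items() if not evidence.get(key)]
    return (not missing, missing)
```

A skill written this way fails loudly with the list of unmet criteria instead of letting the agent declare victory.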


The anti-rationalization pattern is the strongest idea

The README calls out anti-rationalization as a key design choice. This is exactly the kind of thing agents need.

Agents are very good at producing plausible explanations for why a shortcut is acceptable:

  • “This is a small change, no test needed”: Small changes can still break shared behavior
  • “The code looks right”: Looking right is not runtime evidence
  • “I will clean it up later”: Later often means never
  • “The existing code is messy anyway”: Messy surroundings do not justify more entropy
  • “The build probably passes”: Probably is not a quality gate

A senior engineer catches these rationalizations instinctively. A coding agent needs them written down. Agent Skills bakes that into the workflow format.

This is the part I would copy into almost every serious agent setup: not just what to do, but the excuses the agent is not allowed to use.
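In a home-grown setup, the same idea can be sketched as a banned-excuse list checked against the agent's own output; the phrases and counter-arguments below mirror the table above but are my shorthand, not the repository's wording:

```python
RATIONALIZATIONS = {
    "no test needed": "Small changes can still break shared behavior.",
    "looks right": "Looking right is not runtime evidence.",
    "clean it up later": "Later often means never.",
    "probably passes": "Probably is not a quality gate.",
}

def flag_rationalizations(agent_message: str) -> list[str]:
    """Return the counter-argument for every banned excuse found in the message."""
    lowered = agent_message.lower()
    return [reply for phrase, reply in RATIONALIZATIONS.items() if phrase in lowered]
```

Even a crude substring check like this turns a soft norm into something the harness can act on.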


Agent personas add another useful layer

The repository also includes specialist agent personas:

  • code-reviewer: Staff-level code review perspective
  • test-engineer: QA and test strategy perspective
  • security-auditor: Security and threat-modeling perspective

This maps well to how strong engineering teams work. The same person can write code, but the review lens is different from the implementation lens. The same agent can generate tests, but a test-engineer persona should ask different questions than a builder persona.

For AI coding, this separation helps prevent a common issue: the agent that wrote the code is too willing to accept its own work. A review persona gives the system permission to be stricter.
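Mechanically, a persona is often nothing more than a different system framing wrapped around the same model. A minimal sketch, with prompt text that is entirely my own invention rather than the repository's:

```python
PERSONAS = {
    "code-reviewer": "Review this change as a staff engineer. Flag issues; do not rewrite.",
    "test-engineer": "Propose the smallest tests that would catch a regression here.",
    "security-auditor": "Threat-model this change. Assume all inputs are hostile.",
}

def build_prompt(persona: str, diff: str) -> str:
    """Same model, different lens: the persona supplies the framing, the diff the subject."""
    return f"{PERSONAS[persona]}\n\n{diff}"
```

The separation matters because the reviewer framing never sees the builder's justifications, only the diff.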


Tool support is broad by design

Agent Skills is not only for one environment.

The README includes setup paths for:

  • Claude Code, including marketplace installation
  • Cursor
  • Gemini CLI
  • Windsurf
  • OpenCode
  • GitHub Copilot
  • Kiro IDE and CLI
  • Codex and other agents that can consume instruction files

That portability comes from the format. The core content is mostly Markdown. Different tools may load it differently, but the underlying workflows are not locked to a single vendor.

For teams, that is useful. You can standardize the engineering process even if developers use different agent frontends.


Where I would use it immediately

I would not start by installing every workflow and expecting magic. I would start with the highest-risk points in the development loop.

1. Code review

The code-review-and-quality skill is probably the first place to start. Review is where hidden issues surface: unclear boundaries, missing tests, risky abstractions, inconsistent error handling, and changes that are too large to reason about.

If an agent can consistently review with a staff-engineer lens, it becomes useful even before you trust it to implement major features.

2. Test-driven development

The test-driven-development skill matters because agents often optimize for finishing the visible patch. A test workflow forces the conversation back to proof.

Good agent testing should answer:

  • What behavior changed?
  • What is the smallest test that proves it?
  • What edge case would break this?
  • Did the test fail before the fix?
  • Did the relevant test pass after the fix?

That is the difference between code generation and engineering.
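The fail-before/pass-after questions collapse into a check a harness could enforce, assuming it records both test runs (a sketch, not the skill's actual mechanism):

```python
def change_is_proven(failed_before: bool, passed_after: bool) -> bool:
    """A fix is proven only if the test failed on the old code and passes on the new.

    Passing both times means the test never exercised the bug;
    failing both times means the fix does not work.
    """
    return failed_before and passed_after
```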

3. Debugging and error recovery

Debugging is where agents can waste time confidently. A structured triage process matters: reproduce, localize, reduce, fix, guard.

Without that sequence, an agent may patch symptoms. With it, the agent is more likely to find the actual fault line.

4. Security and hardening

Security review is a natural fit for skills because it is checklist-heavy but context-sensitive. You want the agent to check input boundaries, secrets, auth assumptions, dependencies, headers, CORS, and injection risks, but only in ways relevant to the actual code.

This is not a replacement for real security review. It is a way to catch obvious misses before human review.


What this means for AI-assisted engineering

Agent Skills points toward a bigger shift: the next advantage in AI coding will not come only from stronger models. It will come from better operating systems around the models.

The model is the reasoning engine. Skills are the process memory.

That distinction matters. Senior engineers are not valuable only because they can type code. They carry judgment about sequencing, blast radius, rollback paths, test evidence, dependency risk, review quality, and when not to touch something. Agent Skills turns some of that judgment into reusable procedures.

This is why the project feels more important than a normal prompt library. It is not trying to make agents sound smarter. It is trying to make them work with more discipline.


The tradeoffs

There are tradeoffs.

First, structured workflows add friction. For a tiny one-line change, a full spec-plan-build-review-ship process can be too heavy. The trick is to use the right skill for the risk level.

Second, skills still depend on the model following instructions. They improve behavior, but they do not guarantee it. You still need verification, tests, and human judgment.

Third, teams will need to adapt the defaults. A startup prototype, a healthcare data platform, and a financial services backend should not share the exact same risk thresholds.

But those are good tradeoffs. A workflow you can tune is better than a vibe you cannot inspect.


Quick start

For Claude Code, the README shows marketplace installation:

/plugin marketplace add addyosmani/agent-skills
/plugin install agent-skills@addy-agent-skills

If SSH cloning is a problem, the project documents an HTTPS install path:

/plugin marketplace add https://github.com/addyosmani/agent-skills.git
/plugin install agent-skills@addy-agent-skills

For local development:

git clone https://github.com/addyosmani/agent-skills.git
claude --plugin-dir /path/to/agent-skills

Other tools use their own loading paths, but the core idea is the same: expose the skill files to your agent and let the relevant workflow activate for the task.


Final take

Agent Skills is worth paying attention to because it treats AI coding as an engineering workflow problem, not just a model capability problem.

The strongest teams using AI agents will not be the ones that ask for code fastest. They will be the ones that make agents follow the same quality loop good engineers already use: define clearly, plan in small units, build incrementally, verify with evidence, review critically, and ship carefully.

That is what this repository packages.

For serious codebases, this is the direction agent tooling needs to go.

Frequently asked questions

What is Agent Skills?

Agent Skills is an open-source workflow pack by Addy Osmani that gives AI coding agents reusable engineering skills, lifecycle commands, personas, hooks, and quality gates for production software work.

Why do AI coding agents need structured skills?

Structured skills help agents follow repeatable engineering processes such as writing specs, planning work, testing changes, reviewing code, simplifying complexity, and shipping with evidence.

Which teams benefit most from Agent Skills?

Teams using AI agents on real codebases benefit most, especially when they need stronger review discipline, test evidence, security checks, and consistent workflows across different agent tools.