Unlock agent-native CI/CD with the RWX Skill

February 19, 2026
Jason Robinaugh

LLMs benefit from fresh context and up-to-date documentation. RWX has released an official skill that lets coding agents find relevant documentation, lint run definition files, and validate their changes with real RWX runs.

#Firing it up

It's easy to add the RWX skill to any agentic tool you're using. Claude Code users can add it with the /plugin command, like so:

claude
❯ /plugin marketplace add rwx-cloud/skills
⎿ Successfully added marketplace: rwx
❯ /plugin install rwx
⎿ ✓ Installed rwx. Restart Claude Code to load new plugins.

The RWX skill can also be added to other coding assistants (OpenAI's Codex, Cursor, GitHub Copilot, and many more) using npx skills:

npx skills add rwx-cloud/skills

The skill can be invoked via slash command (e.g. "/rwx <prompt>") or using natural language (e.g. "Can you help me write an RWX run definition to test my code?"), and it allows an agent to offer valuable insights about RWX without hallucinating or stumbling through web searches.

We've seen great early success both improving existing run definitions and creating run definitions from scratch (or with prior art when migrating from other CI/CD services, like GitHub Actions).

#Better CLI tooling means skills stay fresh

When we were building the RWX skill, we knew we wanted it to be easy for agents to always have the right information. It's easy to put some quick notes in an AGENTS.md file, but that is harder to distribute, can become out of date, and takes up space in the context window even when it's not relevant to the task at hand.

With a skill, agents always load some brief frontmatter from the skill's markdown file, but otherwise just fetch detailed information when they need it.

Like most software projects, we approached building the RWX skill with the goal of shipping a "dumb client," where very little knowledge actually has to live on users' machines. This meant introducing more CLI tooling, such as the following:

  • rwx docs search <query> and rwx docs pull <path> - Agents use these commands to pull markdown versions of our documentation from rwx.com/docs. They are faster and more reliable than curl or WebFetch tool calls. We found that WebFetch would often summarize away important information, resulting in more hallucinations.
  • rwx lint and rwx lsp serve - These commands allow agents to get the same gut check on RWX syntax that users of our VS Code extension have enjoyed, right from the terminal.
  • rwx packages list and rwx packages show - These make it easy for agents to browse the built-in packages RWX maintains and get the latest documentation on the interfaces for using them.
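As a rough illustration of how an agent might chain these commands before touching a run definition, here is a minimal bash sketch. The function name `research_and_lint` and the example query are hypothetical. The sketch also assumes, without confirmation from this post, that `rwx lint` can be run without arguments and exits non-zero when it finds problems.

```shell
#!/usr/bin/env bash
# Sketch only: look up docs, browse packages, then lint.
# Assumptions (not confirmed by this post): `rwx lint` needs no arguments
# and signals problems with a non-zero exit code.

research_and_lint() {
  local query="$1"

  # Search the RWX docs for pages relevant to the task at hand.
  rwx docs search "$query" || return 1

  # List the packages RWX maintains so one can be inspected further
  # with `rwx packages show`.
  rwx packages list || return 1

  # Lint the run definitions and report the outcome.
  if rwx lint; then
    echo "lint passed"
  else
    echo "lint failed" >&2
    return 1
  fi
}
```

Calling `research_and_lint "caching"` would run the three commands in order and stop at the first one that fails.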

In addition to those new tools, the RWX skill also ensures that agents know how to use the RWX CLI to test runs and analyze results, with commands like:

  • rwx run <file> --wait - When allowed by a user, the agent can get into a run -> fix -> run again loop until green.
  • rwx logs and rwx artifacts - These provide easy programmatic access to debug CI runs locally, without having to pop out to the browser.
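The run, fix, run again loop could be sketched in bash roughly as follows. This assumes `rwx run <file> --wait` exits non-zero when the run fails, which this post does not state. `run_until_green` and `apply_fix` are hypothetical names; `apply_fix` is a placeholder for whatever edits the agent makes between attempts.

```shell
#!/usr/bin/env bash
# Sketch only: retry a run a bounded number of times.
# Assumption: `rwx run <file> --wait` exits non-zero when the run fails.

# Hypothetical placeholder: in practice the agent would inspect the failure
# (for example with `rwx logs`) and edit the run definition here.
apply_fix() {
  echo "would fix $1 after failed attempt $2" >&2
}

run_until_green() {
  local file="$1" max_attempts="${2:-3}" attempt=1

  while [ "$attempt" -le "$max_attempts" ]; do
    if rwx run "$file" --wait; then
      echo "green after $attempt attempt(s)"
      return 0
    fi
    # Only attempt a fix if another run will follow.
    if [ "$attempt" -lt "$max_attempts" ]; then
      apply_fix "$file" "$attempt" || return 1
    fi
    attempt=$((attempt + 1))
  done

  echo "still failing after $max_attempts attempt(s)" >&2
  return 1
}
```

Capping the number of attempts keeps the loop from burning through runs when a failure is not something the agent can fix on its own.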

Giving agents a set of deterministic tools lets them work much more efficiently and make more accurate decisions.

#Tight feedback loops

It's been fascinating to use the RWX skill as a way to find gaps in our documentation and nuances about the RWX platform that could be explained more clearly or explicitly. As an example, we found that Claude Code would flag "missing" explicit dependencies when one task used an output value of another task, and that led us to update our documentation to confirm how implicit dependencies like that actually work.

We've also been developing an eval framework, in a similar vein to the one Dagster has blogged about. We'll likely blog more about ours as it matures, but it's a fantastic feedback loop. It runs a headless Claude Code agent against a set of fixtures, then runs a set of Go tests against the output to ensure the agent generated something reasonable. It also outputs the JSON from that headless Claude session, which can then be fed into another agent to analyze where the headless agent got confused, burned tokens re-working something, or went off track (and why).
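The overall shape of such a loop might look like the bash sketch below. It is not RWX's actual framework. The function name `run_evals`, the fixture layout (one directory per fixture containing a `prompt.txt`), the transcript directory, and running `go test ./...` inside each fixture are all assumptions made for illustration. The `claude -p` and `--output-format json` flags are Claude Code's headless print mode.

```shell
#!/usr/bin/env bash
# Sketch only: run a headless agent per fixture, keep its JSON transcript,
# then run Go tests against what it produced.
# Hypothetical layout: <fixtures_dir>/<name>/prompt.txt plus Go tests.

run_evals() {
  local fixtures_dir="$1" out_dir="$2" failures=0 fixture name ok

  mkdir -p "$out_dir"
  for fixture in "$fixtures_dir"/*/; do
    [ -d "$fixture" ] || continue   # skip when the glob matches nothing
    name="$(basename "$fixture")"
    ok=1

    # Headless Claude Code session; the JSON transcript is saved so that
    # another agent can analyze it later.
    (cd "$fixture" && claude -p "$(cat prompt.txt)" --output-format json) \
      > "$out_dir/$name.json" || ok=0

    # Check the generated output; test output goes to stderr so stdout
    # carries only the summary line.
    (cd "$fixture" && go test ./... >&2) || ok=0

    [ "$ok" -eq 1 ] || failures=$((failures + 1))
  done

  echo "$failures fixture(s) failed"
}
```

The saved transcripts in the output directory are what a second agent would read to work out where the headless session went off track.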

In addition to Claude's feedback, we'd naturally love to hear yours as you work with the RWX skill and as we all adapt to an agent-first, CLI-driven approach to CI/CD.
