SkillCI Documentation

Welcome! SkillCI is CI/CD for coding-agent configuration — it tests, scores, and gates changes to your skills, hooks, rules, and CLAUDE.md before you trust them in Claude Code, Cursor, or Codex.

Start here

Guide            What it covers
Getting Started  Install, run the offline demo, your first live evaluation, project layout. Read this first.
Concepts         Baseline vs candidate, the verdict model, the three scoring dimensions, the regression gate.
CLI Reference    Every command, flag, and exit code.
Writing Tasks    Author task fixtures, objective checks, and judge rubrics.
Scoring          The composite formula, thresholds, and how a verdict is decided.
Agents & Auth    Claude/Cursor/Codex adapters, auth models, and adding a new adapter.
CI Integration   Wire SkillCI into GitHub Actions as a PR gate.
Architecture     Module-by-module map, data flow, and the shared contracts.
Troubleshooting  Common issues and FAQ.

The one-paragraph mental model

You point SkillCI at two versions of an agent's config — the baseline (trusted) and a candidate (proposed change). It runs a suite of sandboxed tasks twice, once per config, driving a real coding agent headlessly. Each run is scored three ways (objective checks, LLM judge, cost), the scores are compared, and a verdict (improved / neutral / regressed) comes out. A PR is opened only when the candidate is improved with zero hard regressions. Everything also runs fully offline via a deterministic mock.
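
The compare-and-gate step described above can be sketched in a few lines of Python. This is an illustration only, not SkillCI's code: the names (RunScore, composite, verdict, hard_regressions, should_open_pr), the weights, the 0.02 margin, and the definition of a hard regression as "objective checks passed on the baseline but fail on the candidate" are all assumptions made for the example. The actual formula and thresholds are the ones described in the Scoring guide.

```python
from dataclasses import dataclass


@dataclass
class RunScore:
    """One task run under one config. All fields are hypothetical."""
    task: str
    objective_passed: bool  # outcome of the objective checks
    judge: float            # LLM-judge score, assumed to lie in [0, 1]
    cost: float             # normalised cost, assumed in [0, 1], lower is better


def composite(s: RunScore) -> float:
    # Assumed weights, chosen only for illustration.
    return 0.5 * float(s.objective_passed) + 0.4 * s.judge + 0.1 * (1.0 - s.cost)


def mean_composite(runs: list[RunScore]) -> float:
    return sum(composite(r) for r in runs) / len(runs)


def verdict(baseline: list[RunScore], candidate: list[RunScore],
            margin: float = 0.02) -> str:
    # Compare mean composite scores; the margin is an assumed noise band.
    delta = mean_composite(candidate) - mean_composite(baseline)
    if delta > margin:
        return "improved"
    if delta < -margin:
        return "regressed"
    return "neutral"


def hard_regressions(baseline: list[RunScore],
                     candidate: list[RunScore]) -> list[str]:
    # Assumed definition: a task that passed its objective checks on the
    # baseline but fails them on the candidate.
    base = {r.task: r for r in baseline}
    return [c.task for c in candidate
            if c.task in base
            and base[c.task].objective_passed
            and not c.objective_passed]


def should_open_pr(baseline: list[RunScore], candidate: list[RunScore]) -> bool:
    # Gate: the candidate must be improved AND have zero hard regressions.
    return (verdict(baseline, candidate) == "improved"
            and not hard_regressions(baseline, candidate))


baseline = [RunScore("fix-bug", True, 0.6, 0.5), RunScore("add-test", True, 0.7, 0.5)]
candidate = [RunScore("fix-bug", True, 0.9, 0.4), RunScore("add-test", True, 0.8, 0.5)]
print(verdict(baseline, candidate), should_open_pr(baseline, candidate))
```

The sketch keeps the two gate conditions separate on purpose: a candidate can raise the average score and still break a task that used to pass, and under the rule above that alone is enough to block the PR.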

Contributing

See CONTRIBUTING.md and the Architecture guide.