SkillCI Documentation

Welcome! SkillCI is CI/CD for coding-agent configuration — it tests, scores, and gates changes to your skills, hooks, rules, and CLAUDE.md before you trust them in Claude Code, Cursor, or Codex.

Start here

Guide	What it covers
Getting Started	Install SkillCI, run the offline demo, run your first live evaluation, and learn the project layout. Read this first.
Concepts	Baseline vs candidate, the verdict model, the three scoring dimensions, the regression gate.
CLI Reference	Every command, flag, and exit code.
Writing Tasks	Author task fixtures, objective checks, and judge rubrics.
Scoring	The composite formula, thresholds, and how a verdict is decided.
Agents & Auth	Claude/Cursor/Codex adapters, auth models, and adding a new adapter.
CI Integration	Wire SkillCI into GitHub Actions as a PR gate.
Architecture	Module-by-module map, data flow, and the shared contracts.
Troubleshooting	Common issues and FAQ.

The one-paragraph mental model

You point SkillCI at two versions of an agent's config: the baseline (trusted) and a candidate (the proposed change). It runs the same suite of sandboxed tasks once per config, driving a real coding agent headlessly. Each run is scored on three dimensions (objective checks, LLM judge, cost), the scores are compared, and a verdict comes out: improved, neutral, or regressed. A PR is opened only when the candidate is improved with zero hard regressions. Everything can also run fully offline via a deterministic mock.
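
To make the compare-and-gate step concrete, here is a minimal Python sketch of how per-task scores could be turned into a verdict and a PR decision. It is not SkillCI's actual code or formula. The names (`TaskScore`, `composite`, `verdict`, `should_open_pr`), the weights, the neutral band, and the definition of a hard regression (a task that fully passed on the baseline but not on the candidate) are all assumptions made for illustration. The Scoring guide describes the real composite formula and thresholds.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical weights and threshold, chosen only for illustration.
WEIGHTS = {"objective": 0.6, "judge": 0.3, "cost": 0.1}
NEUTRAL_BAND = 0.02  # mean deltas within +/- this band count as "neutral"


@dataclass(frozen=True)
class TaskScore:
    objective: float  # fraction of objective checks passed, 0..1
    judge: float      # LLM judge score, normalised to 0..1
    cost: float       # cost score, 0..1, where higher means cheaper
    passed: bool      # True if every objective check passed


def composite(score: TaskScore) -> float:
    """Weighted sum of the three scoring dimensions (assumed formula)."""
    return (
        WEIGHTS["objective"] * score.objective
        + WEIGHTS["judge"] * score.judge
        + WEIGHTS["cost"] * score.cost
    )


def verdict(
    baseline: dict[str, TaskScore], candidate: dict[str, TaskScore]
) -> tuple[str, list[str]]:
    """Compare two runs of the same suite; return (label, hard regressions)."""
    # Assumed definition: a hard regression is a task that fully passed
    # on the baseline config but no longer passes on the candidate.
    hard = sorted(
        task for task in baseline
        if baseline[task].passed and not candidate[task].passed
    )
    delta = mean(
        composite(candidate[task]) - composite(baseline[task])
        for task in baseline
    )
    if delta > NEUTRAL_BAND:
        label = "improved"
    elif delta < -NEUTRAL_BAND:
        label = "regressed"
    else:
        label = "neutral"
    return label, hard


def should_open_pr(label: str, hard: list[str]) -> bool:
    """Gate: open a PR only for an improved candidate with no hard regressions."""
    return label == "improved" and not hard


if __name__ == "__main__":
    # Task names and scores are made up for the example.
    base = {
        "fix-bug": TaskScore(0.5, 0.5, 0.5, passed=False),
        "add-test": TaskScore(1.0, 0.8, 0.5, passed=True),
    }
    cand = {
        "fix-bug": TaskScore(1.0, 0.7, 0.5, passed=True),
        "add-test": TaskScore(1.0, 0.8, 0.5, passed=True),
    }
    label, hard = verdict(base, cand)
    print(label, hard, should_open_pr(label, hard))
```

The sketch keeps the verdict and the hard-regression check separate, so a candidate can score as improved on average and still be blocked because one previously passing task now fails.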

Contributing

See CONTRIBUTING.md and the Architecture guide.