Changelog
2026-04-12 by Fatih
Agent eval suite: 25 tests guard output quality
25 automated tests now validate every agent and skill in CI, catching formatting drift, missing contracts, and schema violations before they ship.
improvement
What's new
Two new test files run on every pull request:
- `test_agent_compliance.py`: 11 tests covering frontmatter schema, model selection, communication protocol, identity lines, severity indicators, and emoji usage
- `test_skill_compliance.py`: 14 tests covering output-kit contract, atlas-report overflow clause, naming conventions, and structured workflow steps
Drift tracking uses explicit allowlists: known violations are enumerated, and a test fails when a violation has been fixed but its entry has not been removed from the allowlist, so nothing slips through unreviewed.
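The allowlist idea can be sketched roughly as follows. This is a minimal illustration, not the code in the suite: the function name `check_drift` and the example skill names are hypothetical, and failing on violations that are not yet allowlisted is an assumption about how the tests behave.

```python
from __future__ import annotations


def check_drift(violations: set[str], allowlist: set[str]) -> list[str]:
    """Return failure messages; an empty list means the check passes.

    Hypothetical helper: `violations` is what the scan found right now,
    `allowlist` is the reviewed list of known violations.
    """
    failures = []
    # Assumed behavior: a violation nobody has reviewed yet fails the test.
    for name in sorted(violations - allowlist):
        failures.append(f"{name}: new violation, not on the allowlist")
    # Stale entry: the violation was fixed but the allowlist still names it.
    for name in sorted(allowlist - violations):
        failures.append(f"{name}: fixed, remove it from the allowlist")
    return failures


# Illustrative data only; these names are placeholders.
known = {"skill-a", "skill-b"}
print(check_drift({"skill-a", "skill-b"}, known))  # nothing changed
print(check_drift({"skill-a"}, known))             # skill-b was fixed
```

Treating a stale entry as a failure means the allowlist can only shrink through an explicit, reviewed edit, which is what keeps it an accurate record of remaining drift.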
Also fixed
12 skills were missing the output-kit contract reference line. Fixed in: draft-landing, draft-patterns, form-exam, form-palette, form-style, lens-chart, pitch-landing, prism-chart, prism-stack, proof-design, surge-landing, touch-ui.