/cortex-eval

Cortex | ML & AI

Evaluates a deployed model or LLM integration for performance issues: checks for accuracy degradation against a reference dataset, data distribution drift that may explain behavior changes, latency regression compared to baseline, and cost increases from token usage changes. Produces a health report with recommended actions.
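As an illustration of the kind of comparison described above, here is a minimal Python sketch that contrasts a baseline evaluation run with a current one and flags accuracy, latency, and token-usage regressions. It is not the skill's actual implementation: the `Snapshot` structure, the `health_report` function, and all three thresholds are hypothetical and chosen only for the example.

```python
from __future__ import annotations

from dataclasses import dataclass
from statistics import mean, quantiles


@dataclass
class Snapshot:
    """Metrics from one evaluation run (hypothetical structure)."""
    predictions: list[str]
    labels: list[str]
    latencies_ms: list[float]
    tokens_per_request: list[int]


def accuracy(s: Snapshot) -> float:
    """Fraction of predictions that match the reference labels."""
    correct = sum(p == y for p, y in zip(s.predictions, s.labels))
    return correct / len(s.labels)


def p95(values: list[float]) -> float:
    """95th percentile; quantiles(n=20) yields 19 cut points, the last is p95."""
    return quantiles(values, n=20)[-1]


def health_report(
    baseline: Snapshot,
    current: Snapshot,
    max_acc_drop: float = 0.02,       # assumed tolerance: 2 accuracy points
    max_latency_ratio: float = 1.2,   # assumed tolerance: p95 up to 20% slower
    max_token_ratio: float = 1.15,    # assumed tolerance: 15% more tokens
) -> dict[str, bool]:
    """Return one boolean flag per check; True means the check regressed."""
    acc_drop = accuracy(baseline) - accuracy(current)
    latency_ratio = p95(current.latencies_ms) / p95(baseline.latencies_ms)
    token_ratio = mean(current.tokens_per_request) / mean(baseline.tokens_per_request)
    return {
        "accuracy_regressed": acc_drop > max_acc_drop,
        "latency_regressed": latency_ratio > max_latency_ratio,
        "token_cost_increased": token_ratio > max_token_ratio,
    }
```

Token count is used here as a stand-in for cost; a real report would multiply by the provider's per-token price and attach a recommended action to each flag that is raised.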

Install

This skill

Install Cortex Eval

1. Add to marketplace

$ claude plugin marketplace add tonone-ai/tonone

2. Install this skill

$ claude plugin install cortex-eval@tonone-ai

The agent

Install Cortex

1. Add to marketplace

$ claude plugin marketplace add tonone-ai/tonone

2. Install Cortex

$ claude plugin install cortex@tonone-ai


Invoke this skill

Command: $ /cortex-eval

When to use

When model outputs seem worse after a retraining run or a model version upgrade. When users report worse AI feature quality without a clear cause. On a regular evaluation cadence (monthly or quarterly) once models have been in production long enough to drift.

Deep dive

AI Model Evaluation and Drift Check

Models decay silently. /cortex-eval checks accuracy regression against reference data, distribution drift, latency baseline, and token cost shifts.
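One common way to quantify distribution drift for a numeric feature is the Population Stability Index (PSI), sketched below in Python. The source does not say which drift metric /cortex-eval uses, so treat PSI as an assumed example; the function name `psi`, the bin count, and the epsilon floor are all illustrative choices.

```python
import math


def psi(reference: list[float], current: list[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index of `current` relative to `reference`.

    Bins are equal-width over the reference range. Empty bins are floored
    at `eps` so the logarithm stays defined.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # degenerate case: every reference value is identical

    def proportions(values: list[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            # Clamp so values outside the reference range land in the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    # Each term is non-negative, so PSI is 0 only when the binned shapes match.
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))
```

A commonly cited rule of thumb reads PSI below 0.1 as little shift, 0.1 to 0.25 as moderate shift, and above 0.25 as significant shift, though the right cutoffs depend on the feature and the bin count.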

More from Cortex

Cortex | ML

/cortex-model

Builds an end-to-end ML pipeline: data ingestion with validation, feature engineering with a feature store, model training with cross-validation and hyperparameter tuning, evaluation against a held-out test set, and deployment to a serving endpoint with monitoring. Chooses the right model type for the problem.

When building a prediction, classification, or reg…

Cortex | ML

/cortex-prompt

Designs, versions, and evaluates prompts for LLM-powered features. Builds evaluation suites that test prompts against a representative set of inputs so quality regressions are caught before they reach users. Applies systematic prompt engineering techniques and documents what each version changes and why.

When building AI features where prompt quality dir…

Cortex | ML

/cortex-integrate

Integrates an LLM into a production service: API client with retry and timeout handling, streaming response support, semantic caching to avoid redundant API calls, fallback provider configuration, and cost controls including token budget enforcement and request rate limiting per user.

When adding an AI feature to an existing product f…
