What does /cortex-model do?

It builds an end-to-end ML pipeline: data ingestion with validation, feature engineering with a feature store, training with cross-validation and hyperparameter search, evaluation against a held-out test set, deployment to a serving endpoint, and monitoring with drift detection.

What model types does /cortex-model support?

Classification, regression, ranking, and anomaly detection. The skill picks the right model family based on the problem characterization (data shape, success criteria, error costs).

How is /cortex-model different from a generalist building a model?

A generalist produces a notebook. /cortex-model produces the engineering layer around the model: orchestrated ingestion, feature store, experiment tracking, deployment with versioning, and monitoring with drift detection.

When should I use /cortex-model?

When building a prediction or classification model from labeled data for the first time and you want a production pipeline rather than a notebook. Also when an existing model needs proper versioning, evaluation, and serving infrastructure.

What feature stores does /cortex-model support?

Feast (open-source, recommended for greenfield), Tecton (managed), and project-specific approaches when those are already in use. The skill matches the existing tool rather than imposing a new one.

How do I install /cortex-model?

Install Tonone for Claude Code via the get-started guide at tonone.ai/get-started. /cortex-model ships with the Cortex agent and is invoked as a slash command in any Claude Code session. Tonone is free and MIT-licensed.

Is /cortex-model free?

Yes. The skill is part of Tonone, which is MIT-licensed. The only cost is Claude Code token usage during the work plus the compute cost of training and serving.

Does /cortex-model handle drift detection?

Yes. Deployed models include input distribution monitoring (Kolmogorov-Smirnov test on each feature) and prediction distribution monitoring, with alerts when distributions shift beyond a threshold.

AI End-to-End ML Pipeline Builder

Most ML models that get prototyped never make it to production, and most that do make it to production are running off scripts that nobody can reproduce. The reason is the gap between a notebook and a pipeline. A notebook is a sequence of cells that ran once on the analyst's laptop. A pipeline is a versioned, scheduled, reproducible system: ingestion that handles new data, feature engineering that runs the same way at training and inference, training that produces an artifact you can compare to previous artifacts, evaluation that catches regressions, and a serving layer that exposes the model with the same monitoring everything else in the stack has. Bridging that gap is the work that ML practitioners call MLOps, and it is the work that gets skipped because the notebook produced a result and the result felt like the deliverable.

The cost of skipping it comes due the first time the model needs to be retrained, the first time the team wants to compare two model versions, and the first time the model's accuracy degrades silently in production because nobody is watching. Each of these recovery moments tends to cost more than building the pipeline correctly the first time, and the cumulative cost is why teams that take ML seriously build the pipeline early. The /cortex-model skill is built to produce that pipeline as the default rather than the cleanup pass: data ingestion with validation, feature engineering with a feature store, training with cross-validation and hyperparameter tuning, evaluation against a held-out test set, and deployment to a serving endpoint with monitoring.

Why generalist AI ships notebooks instead of pipelines

Ask Cursor or ChatGPT to build a classifier for your data. You get a notebook. The notebook reads the data with pandas, splits it into train and test, trains a model, and prints accuracy. The notebook is correct as a prototype; it is not a pipeline. The data ingestion is hardcoded to the local CSV path. The feature engineering is inlined into the training cell, so it cannot run at inference time without copy-pasting. The model is saved with an unversioned blob serializer. The evaluation is a single accuracy number from one train/test split, with no cross-validation and no comparison to a previous version. There is no serving layer. There is no monitoring. The notebook works for the demo and breaks the moment somebody tries to use the model for anything real.

The deeper issue is that the notebook represents only about ten percent of the work of running ML in production. The other ninety percent is the engineering around the model: the feature store that ensures train/serve consistency, the experiment tracking that lets the team compare model versions, the deployment pipeline that ships the model to a serving endpoint, the monitoring that alerts when accuracy or input distribution drifts. A generalist tool produces the ten-percent prototype because that is what the prompt looks like. /cortex-model produces the ninety percent that makes the prototype useful.

What an end-to-end pipeline requires

A useful ML pipeline has six layers. First, data ingestion: the pipeline reads from the source of truth (data warehouse, event stream, application database) and validates the data before training. Second, feature engineering with a feature store: features are computed once and reused at both training and inference, so the model is not silently broken by train/serve skew. Third, training: cross-validation, hyperparameter search, with experiment tracking so the team can compare model versions deliberately. Fourth, evaluation: a held-out test set, with the metrics calibrated to the actual business outcome (precision/recall trade-offs that match the cost of false positives and false negatives). Fifth, deployment: a serving layer that exposes the model with versioning, with the option to A/B test new versions against the current one. Sixth, monitoring: input distribution drift detection, prediction distribution monitoring, and accuracy tracking against ground truth as it becomes available.

Each layer is its own discipline, and skipping any of them creates a specific failure mode: skip ingestion validation and the model gets trained on bad data; skip the feature store and you get train/serve skew; skip evaluation rigor and you ship regressions; skip monitoring and you discover the silent decay only when a customer reports a bad prediction. Building all six together is the hard part; doing it cheaply is what /cortex-model is built for.

How /cortex-model works

Step one: characterize the problem

When invoked, /cortex-model asks for the problem in concrete terms: what is being predicted, what input data is available, what success looks like, and what a false positive costs relative to a false negative. The answers determine the model type (classification, regression, ranking, anomaly detection), the evaluation metrics, and the architectural decisions (whether a feature store is needed, whether inference needs to be sub-100ms, whether the model needs retraining weekly or quarterly).

Step two: ingestion and feature engineering

The skill produces the data ingestion layer (Airflow, Dagster, Prefect, or the project's existing orchestrator) with validation rules that catch malformed inputs before training. Feature engineering is implemented in a feature store (Feast, Tecton, or a project-specific approach) so the same feature definitions run at training and inference. The discipline is to make the features reusable across model versions rather than reimplementing them per-model.
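
As an illustration of the kind of validation rule described above, here is a minimal sketch in plain Python. The schema, the allowed plan tiers, and the `validate_rows` helper are all hypothetical; a real pipeline would more likely express these rules in the orchestrator or a dedicated validation library, and nothing here is taken from the skill's actual output.

```python
from dataclasses import dataclass

# Hypothetical schema: column name -> (expected type, nullable)
SCHEMA = {
    "customer_id": (str, False),
    "days_since_signup": (int, False),
    "invoice_failure_count_30d": (int, False),
    "plan_tier": (str, False),
}
ALLOWED_PLAN_TIERS = {"free", "pro", "enterprise"}  # assumed values


@dataclass
class ValidationError:
    row_index: int
    column: str
    reason: str


def validate_rows(rows):
    """Return a list of ValidationError; an empty list means the batch is clean."""
    errors = []
    for i, row in enumerate(rows):
        # Type and null checks driven by the schema
        for column, (expected_type, nullable) in SCHEMA.items():
            value = row.get(column)
            if value is None:
                if not nullable:
                    errors.append(ValidationError(i, column, "missing value"))
                continue
            # bool is a subclass of int, so reject it explicitly
            if isinstance(value, bool) or not isinstance(value, expected_type):
                errors.append(ValidationError(i, column, "wrong type"))

        # Domain checks that a type check alone would miss
        days = row.get("days_since_signup")
        if isinstance(days, int) and not isinstance(days, bool) and days < 0:
            errors.append(ValidationError(i, "days_since_signup", "negative value"))
        tier = row.get("plan_tier")
        if isinstance(tier, str) and tier not in ALLOWED_PLAN_TIERS:
            errors.append(ValidationError(i, "plan_tier", "unknown tier"))
    return errors
```

A training job would call `validate_rows` on each ingested batch and refuse to proceed when the returned list is non-empty, which keeps malformed inputs out of the training set.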

Step three: training and evaluation

Training uses cross-validation with the right number of folds for the dataset size. Hyperparameter search uses Bayesian optimization or grid search depending on the search space. Experiment tracking (MLflow, Weights & Biases, or the project's existing tool) records every run with the hyperparameters, the dataset version, the metrics, and the artifact. Evaluation uses a held-out test set with metrics calibrated to the cost of errors: precision-at-k for ranking, F1 for classification with imbalanced classes, MAE/RMSE for regression with outlier sensitivity considered.
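
To make "metrics calibrated to the cost of errors" concrete, the sketch below picks a classification threshold by minimizing total error cost on a set of validation scores. The cost values, the toy data, and the `expected_cost` and `pick_threshold` helpers are illustrative assumptions, not something the skill is documented to emit.

```python
def expected_cost(y_true, scores, threshold, cost_fp, cost_fn):
    """Total cost of errors when predicting positive for score >= threshold."""
    cost = 0.0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if predicted_positive and label == 0:
            cost += cost_fp  # false positive
        elif not predicted_positive and label == 1:
            cost += cost_fn  # false negative
    return cost


def pick_threshold(y_true, scores, cost_fp, cost_fn):
    """Try each distinct score as a threshold and return the cheapest one."""
    # float("inf") represents "never predict positive"
    candidates = sorted(set(scores)) + [float("inf")]
    return min(
        candidates,
        key=lambda t: expected_cost(y_true, scores, t, cost_fp, cost_fn),
    )


# Toy validation set (assumed): 1 = churned, 0 = retained
y_true = [0, 0, 1, 0, 1, 1]
scores = [0.1, 0.3, 0.35, 0.6, 0.7, 0.9]

# Missing a churner is five times worse than a false alarm
low_threshold = pick_threshold(y_true, scores, cost_fp=1, cost_fn=5)
# A false alarm is five times worse than missing a churner
high_threshold = pick_threshold(y_true, scores, cost_fp=5, cost_fn=1)
```

With the same scores, the first cost setting selects a lower threshold than the second, which is the practical meaning of matching the precision/recall trade-off to the cost of each error type.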

Step four: deployment and monitoring

The model is deployed to a serving endpoint with versioning so a new model can be deployed alongside the current one and traffic can be shifted gradually. Monitoring covers input distribution (Kolmogorov-Smirnov test on each feature), prediction distribution (changes in the histogram of predictions), and accuracy as ground truth arrives. Alerts fire when any metric crosses a threshold so the team catches the silent decay before the customer reports it.
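
The per-feature Kolmogorov-Smirnov check mentioned above can be sketched with the standard library alone. This is a minimal illustration assuming a stored reference sample per feature and a large-sample critical value for the alert threshold; the function names are hypothetical, and in practice a library routine such as `scipy.stats.ks_2samp` would typically be used instead.

```python
import math
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    ref, cur = sorted(reference), sorted(current)
    gap = 0.0
    # Both empirical CDFs are step functions, so the largest gap
    # occurs at one of the observed sample points.
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


def has_drifted(reference, current, alpha=0.05):
    """Flag drift when the statistic exceeds the large-sample critical value."""
    n, m = len(reference), len(current)
    critical = math.sqrt(-math.log(alpha / 2) / 2) * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, current) > critical


# Assumed data: training-time values of one feature vs. two serving windows
training_values = list(range(100))
stable_window = list(range(100))
shifted_window = [x + 50 for x in range(100)]
```

Here `has_drifted(training_values, stable_window)` is false and `has_drifted(training_values, shifted_window)` is true. A monitoring job would run the same comparison for each feature on a schedule and alert on any that cross the threshold.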

Train/serve skew is one of the most common production ML bugs and among the hardest to debug. /cortex-model uses a feature store so the same feature code runs in both contexts; this single decision prevents most of the bugs that catch teams the first time they put a model in production.
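
The principle behind that decision can be shown without any feature-store library: one feature definition, called from both the training path and the serving path. The sketch below is plain Python, not Feast's or Tecton's API; the record layout and all function names are assumptions made for illustration.

```python
from datetime import date


def customer_features(signup_date, invoice_events, as_of):
    """Single feature definition shared by training and serving.

    invoice_events is a list of (date, succeeded) tuples.
    """
    window_start = as_of.toordinal() - 30
    failures = sum(
        1
        for day, succeeded in invoice_events
        if not succeeded and window_start < day.toordinal() <= as_of.toordinal()
    )
    return {
        "days_since_signup": (as_of - signup_date).days,
        "invoice_failure_count_30d": failures,
    }


def build_training_row(record, label_date):
    # Training: features are computed as of the label timestamp,
    # so no information from after the label leaks in.
    return customer_features(
        record["signup_date"], record["invoice_events"], as_of=label_date
    )


def build_serving_row(record, today):
    # Serving: the same function, evaluated as of the request date.
    return customer_features(
        record["signup_date"], record["invoice_events"], as_of=today
    )


# Assumed example record
record = {
    "signup_date": date(2025, 1, 1),
    "invoice_events": [
        (date(2025, 1, 15), False),  # outside the 30-day window
        (date(2025, 3, 5), False),
        (date(2025, 3, 20), False),
        (date(2025, 3, 25), True),   # succeeded, not counted
    ],
}
```

Because both paths go through `customer_features`, a change to the 30-day window or the failure definition applies to training and inference at once. Skew typically appears when the two paths each carry their own copy of this logic and only one of them gets updated.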

When to use /cortex-model, and when not to

/cortex-model is the right call when building a prediction, classification, or regression model from labeled data for the first time and the team wants a complete pipeline rather than a notebook. The skill is also the right call when an existing model is running as a script and needs proper versioning, evaluation infrastructure, and a serving layer.

Skip the skill for LLM-powered features (use /cortex-prompt for prompt design and /cortex-integrate for production integration). For pure exploratory data analysis without a deployment target, a notebook is fine. For evaluation of an existing production model (drift detection, accuracy tracking), /cortex-eval is the right call.

Capability	Tonone	Generalist chatbot	Cursor / Copilot
Reproducible data ingestion	Yes, orchestrated and validated	Hardcoded local paths	Not in scope
Feature store for train/serve consistency	Yes, prevents skew	Inline feature code	Not in scope
Cross-validation and hyperparameter search	Yes, calibrated to dataset	Single train/test split	Not in scope
Experiment tracking	Yes, MLflow / W&B integration	No tracking	Not in scope
Serving + monitoring with drift detection	Yes, by default	Unversioned blob in a folder	Not in scope

A worked example: churn prediction pipeline

Suppose the brief is: build a churn prediction model. Run /cortex-model and the output is the pipeline plus the supporting artifacts.

python

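# pipelines/churn/train.py (excerpt)
# read_warehouse, optuna_search, cross_validate_with_calibration, and
# evaluate_on_holdout are assumed to be project helpers not shown in this excerpt.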
from feast import FeatureStore
import mlflow
from sklearn.model_selection import StratifiedKFold
from sklearn.ensemble import GradientBoostingClassifier

def train(run_name: str, label_query: str):
    fs = FeatureStore(repo_path='features')

    # 1. Load training data + features from the feature store
    labels = read_warehouse(label_query)
    features = fs.get_historical_features(
        entity_df=labels,
        features=[
            'customer_features:days_since_signup',
            'customer_features:invoice_failure_count_30d',
            'customer_features:plan_tier',
            # ...12 more
        ],
    ).to_df()

    X, y = features.drop('churned', axis=1), features['churned']

    # 2. Hyperparameter search with cross-validation (Optuna)
    best_params = optuna_search(X, y, n_trials=50)

    # 3. Cross-validate the chosen configuration, then train the final
    #    model on the full set with the best params
    model = GradientBoostingClassifier(**best_params)
    cv_scores = cross_validate_with_calibration(model, X, y, cv=StratifiedKFold(5))
    model.fit(X, y)

    # 4. Held-out test set evaluation
    test_metrics = evaluate_on_holdout(model, holdout_set='2026-Q1')

    # 5. Track to MLflow with versioned artifact
    with mlflow.start_run(run_name=run_name):
        mlflow.log_params(best_params)
        mlflow.log_metrics(test_metrics)
        mlflow.sklearn.log_model(model, 'model', registered_model_name='churn-v3')

    return model

# pipelines/churn/serve.py (excerpt)
# Serves the registered model via FastAPI with feature lookup
# from the same feature store, so train/serve features match.
# Monitoring sidecar logs feature distributions for drift detection.

The pipeline is reproducible. The features come from the same feature store at training and inference. The training is tracked in MLflow so the team can compare runs. Drift monitoring runs alongside serving. When the model needs to be retrained, the pipeline runs the same way as the original training; when a new version needs to ship, the model registry handles versioning. That is what crossing from notebook to production looks like.

CortexML

/cortex-model

Builds an end-to-end ML pipeline: data ingestion with validation, feature engineering with a feature store, model training with cross-validation and hyperparameter tuning, evaluation against a held-out test set, and deployment to a serving endpoint with monitoring. Chooses the right model type for the problem.

When building a prediction, classification, or reg…

Related skills

/cortex-model builds the pipeline. For LLM-powered features, /cortex-prompt covers prompt design and /cortex-integrate covers production integration. For evaluation of an existing model, /cortex-eval produces the drift and accuracy reports.

CortexML

/cortex-prompt

Designs, versions, and evaluates prompts for LLM-powered features. Builds evaluation suites that test prompts against a representative set of inputs so quality regressions are caught before they reach users. Applies systematic prompt engineering techniques and documents what each version changes and why.

When building AI features where prompt quality dir…

CortexML

/cortex-integrate

Integrates an LLM into a production service: API client with retry and timeout handling, streaming response support, semantic caching to avoid redundant API calls, fallback provider configuration, and cost controls including token budget enforcement and request rate limiting per user.

When adding an AI feature to an existing product f…

CortexML

/cortex-eval

Evaluates a deployed model or LLM integration for performance issues: checks for accuracy degradation against a reference dataset, data distribution drift that may explain behavior changes, latency regression compared to baseline, and cost increases from token usage changes. Produces a health report with recommended actions.

When model outputs seem worse after a retraining r…

Install

/cortex-model ships with the Cortex agent in the Tonone for Claude Code package. Install Tonone, invoke /cortex-model from any Claude Code session, and the skill produces the end-to-end pipeline calibrated to the project's data and serving infrastructure.

1. Add to marketplace

$ claude plugin marketplace add tonone-ai/tonone

2. Install Cortex

$ claude plugin install cortex@tonone-ai

ML pipelines that survive contact with production are the ones that did the engineering work upfront. The skill is built so that work is the default, not the cleanup.

Pairs well with

Cortex

Add real machine learning to your product, not just an API call.

Flux

Database schemas and data pipelines that scale without surprises.

Spine

APIs and services built for production, not tutorials.
