Most deploys are an act of optimism. The team merges the change, pushes the button, and hopes nothing goes wrong. When something does go wrong, the rollback procedure is invented at 2am by several engineers looking at each other and asking what to do. The deploy succeeded technically (the pods are running, the binary is the new one) but the application is broken (latency has tripled, the error rate is unacceptable, a customer-visible feature has regressed), and now the team has to figure out how to get back to the previous state without making things worse. The rollback often takes longer than the original deploy because nobody planned for it.
Deploys done well look different. The strategy is chosen deliberately for the service: rolling updates for stateless services, canary releases when the team wants gradual exposure with automatic rollback on error rate, blue-green for instant cutover with the previous version warm. Smoke tests run after each step. Deploy gates check the right metrics before promoting traffic. The rollback procedure is documented before the deploy starts, so the on-call engineer at 2am has a runbook instead of a blank page. The discipline is well-known and rarely applied per service because picking the right strategy and writing the rollback procedure takes time. The /relay-deploy skill produces the artifact so the discipline lands as the default.
Why generalist AI ships unsafe deploys
Ask Cursor or ChatGPT for a deploy step. You get a kubectl apply or a gcloud run deploy. The command works. It also has no rollout strategy, no canary phase, no smoke test, and no rollback procedure. The deploy succeeds (the pods restart), but nothing indicates whether the application is actually healthy under traffic. If the new version is broken, the team finds out from customers. The rollback is a manual deploy of the previous SHA, which depends on someone remembering what the previous SHA was.
The other failure mode is the missing health signal. "Pod is ready" is not the same as "application is healthy under traffic." A pod can be ready (its readiness probe returns 200) and still be broken (every request returns 500). A canary release that promotes traffic based only on pod readiness ships the broken version. The right deploy strategy uses an actual application metric (error rate, latency) to gate promotion, not a pod-level proxy. /relay-deploy configures the right gates.
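To make the distinction concrete, here is a minimal Python sketch of the two kinds of gate. The `CanarySnapshot` type, its field names, and the 1% error threshold are assumptions made for illustration; they are not taken from /relay-deploy's output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CanarySnapshot:
    """Hypothetical view of a canary at one point in time."""
    ready_pods: int
    total_pods: int
    requests: int
    errors_5xx: int


def readiness_gate(s: CanarySnapshot) -> bool:
    # Pod-level proxy: passes as soon as every pod reports ready.
    return s.total_pods > 0 and s.ready_pods == s.total_pods


def error_rate_gate(s: CanarySnapshot, max_error_rate: float = 0.01) -> bool:
    # Application-level signal: needs real traffic before it can pass.
    if s.requests == 0:
        return False  # no traffic yet means no evidence of health
    return s.errors_5xx / s.requests <= max_error_rate


# Every pod is ready, yet every request fails.
broken = CanarySnapshot(ready_pods=3, total_pods=3, requests=200, errors_5xx=200)
print(readiness_gate(broken))   # True
print(error_rate_gate(broken))  # False
```

A promotion rule built on `readiness_gate` alone would ship this version; one built on `error_rate_gate` would stop it.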
What deploy strategy actually requires
A deploy configuration has four parts. First, the strategy: rolling, canary, or blue-green, picked based on the service's tolerance for risk and the cost of holding two versions in production simultaneously. Stateless services with low risk benefit from rolling updates. Critical services where a regression would be expensive benefit from canary with metric-based gates. Services where instant cutover is required (or where rolling updates would cause inconsistency) benefit from blue-green. Second, the gates: which metrics confirm the new version is healthy at each promotion step. Third, the smoke tests: a small set of tests that run after each step, exercising the critical paths to catch obvious regressions before traffic is promoted. Fourth, the rollback: the explicit procedure to revert if any gate fails, with its expected timing and the alerting that triggers it.
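One way to picture the four parts is as a single record that can report what is still missing. The sketch below is illustrative only: `DeployConfig`, `Strategy`, and the field names are hypothetical, and the entries are free-text descriptions instead of real tool configuration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Strategy(Enum):
    ROLLING = "rolling"
    CANARY = "canary"
    BLUE_GREEN = "blue-green"


@dataclass
class DeployConfig:
    strategy: Strategy
    gates: list[str] = field(default_factory=list)           # metric checks per promotion step
    smoke_tests: list[str] = field(default_factory=list)     # critical-path checks
    rollback_steps: list[str] = field(default_factory=list)  # explicit revert procedure

    def missing_parts(self) -> list[str]:
        """Return the names of the parts that are still empty."""
        parts = {
            "gates": self.gates,
            "smoke_tests": self.smoke_tests,
            "rollback_steps": self.rollback_steps,
        }
        return [name for name, value in parts.items() if not value]


# The equivalent of a bare deploy command: a strategy by default, nothing else.
bare = DeployConfig(strategy=Strategy.ROLLING)
print(bare.missing_parts())  # ['gates', 'smoke_tests', 'rollback_steps']
```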
The strategies trade safety against speed. Rolling updates are fast but provide little protection. Canary is slower but catches regressions before they reach all traffic. Blue-green gives instant cutover but briefly requires twice the capacity. The right choice depends on the service. The discipline is to make the choice deliberately and document the reasoning so the next person to extend the deploy understands the constraints.
How /relay-deploy works
Step one: characterize the service
When invoked, /relay-deploy asks for the service's characteristics: stateless or stateful, traffic profile (steady, spiky, scheduled), risk tolerance (consumer-facing critical path or internal tool), and any inconsistency constraints (can two versions run simultaneously, or do they share state that would conflict). The answers determine the strategy.
Step two: pick the strategy
Rolling for stateless services with low risk and steady traffic. Canary for critical services where gradual exposure with automatic rollback is the priority (typically 5% of traffic for 5 minutes, 25% for 10, 50% for 10, 100%, with rollback if error rate exceeds 2x baseline at any step). Blue-green for services that require instant cutover or where rolling would cause inconsistency. The choice is documented with the reasoning so the team can revisit if the service's characteristics change.
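The mapping from service characteristics to strategy can be sketched as a small decision function. This is an illustration under stated assumptions and does not show how the skill actually decides: `ServiceProfile` and its fields are hypothetical, the traffic profile is left out for brevity, the order of the rules is an assumption, and the final fallback for stateful non-critical services is not specified in the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceProfile:
    stateless: bool
    critical: bool              # consumer-facing critical path vs internal tool
    mixed_versions_safe: bool   # can two versions serve traffic at the same time?
    needs_instant_cutover: bool = False


def pick_strategy(p: ServiceProfile) -> tuple[str, str]:
    """Return (strategy, reasoning) so the choice is documented with its rationale."""
    if p.needs_instant_cutover or not p.mixed_versions_safe:
        return "blue-green", "instant cutover required, or mixed versions would cause inconsistency"
    if p.critical:
        return "canary", "critical path: gradual exposure with automatic rollback"
    if p.stateless:
        return "rolling", "stateless and low risk"
    # Assumption: the source does not cover this case; default to the cautious option.
    return "canary", "stateful but versions can coexist: expose gradually"


payments = ServiceProfile(stateless=True, critical=True, mixed_versions_safe=True)
print(pick_strategy(payments)[0])  # canary
```

Returning the reasoning alongside the strategy keeps the "why" next to the "what", which is what lets a team revisit the decision later.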
Step three: gates and smoke tests
Each promotion step has gates: error rate must remain below the threshold, p99 latency must remain within the band, custom application metrics (if relevant) must hold. Smoke tests run after each step: a small set of tests against the canary endpoint exercising the critical paths (auth flow, primary read path, primary write path). If a smoke test fails, the deploy halts and rolls back automatically.
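The promotion loop described here can be sketched in a few lines of Python. Everything named below is hypothetical (`Metrics`, `gates_pass`, `run_canary`, the `observe` and `smoke` callbacks), and the 2x error and 1.4x latency factors are example thresholds. A real controller would also wait out each hold period while sampling metrics, which this sketch skips.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Metrics:
    error_rate: float
    p99_ms: float


# (traffic weight in percent, hold time in minutes)
SCHEDULE = [(5, 5), (25, 10), (50, 10), (100, 0)]


def gates_pass(m: Metrics, baseline: Metrics,
               error_factor: float = 2.0, latency_factor: float = 1.4) -> bool:
    # Both application-level gates must hold for the step to be promoted.
    return (m.error_rate <= baseline.error_rate * error_factor
            and m.p99_ms <= baseline.p99_ms * latency_factor)


def run_canary(baseline: Metrics,
               observe: Callable[[int], Metrics],
               smoke: Callable[[int], bool]) -> tuple[str, int]:
    """Walk the schedule; return ("promoted", 100) or ("rolled_back", weight)."""
    for weight, _hold_minutes in SCHEDULE:
        # Halt at the first failed smoke test or gate and report where it failed.
        if not smoke(weight) or not gates_pass(observe(weight), baseline):
            return "rolled_back", weight
    return "promoted", 100


baseline = Metrics(error_rate=0.002, p99_ms=180.0)


def regress_at_25(weight: int) -> Metrics:
    # Hypothetical: the error rate jumps once the canary takes 25% of traffic.
    return Metrics(0.02, 185.0) if weight >= 25 else Metrics(0.002, 182.0)


print(run_canary(baseline, regress_at_25, smoke=lambda w: True))  # ('rolled_back', 25)
```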
Step four: documented rollback
The rollback procedure is documented before the deploy runs. For canary, rollback is automatic on gate failure. For rolling and blue-green, the procedure lists the explicit command (or button in the deploy tool) and the expected recovery time. The procedure also covers data: if the new version made schema changes that the old version cannot read, the rollback procedure includes the data rollback step. Without this, a deploy that needs to be reverted reveals an undocumented failure mode.
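As a rough illustration of pairing the data step with the code step, the sketch below builds a runbook and refuses to produce one when a needed data rollback is undocumented. The function name and parameters are hypothetical, `orders-api` is a made-up service, and placing the data step before the code step is an assumption: some migrations have to be reverted in the opposite order.

```python
from typing import Optional


def rollback_runbook(
    strategy: str,
    command: str,
    expected_recovery: str,
    old_version_reads_new_schema: bool = True,
    data_rollback: Optional[str] = None,
) -> list[str]:
    """Build the rollback steps, refusing to leave a required data step undocumented."""
    trigger = "automatic on gate failure" if strategy == "canary" else "manual, run by on-call"
    steps = [f"Trigger: {trigger}"]
    if not old_version_reads_new_schema:
        if data_rollback is None:
            raise ValueError("schema change is not backward compatible: document the data rollback")
        # Ordering is an assumption; adjust it to the migration at hand.
        steps.append(f"Data: {data_rollback}")
    steps.append(f"Code: {command}")
    steps.append(f"Expected recovery: {expected_recovery}")
    return steps


runbook = rollback_runbook(
    strategy="rolling",
    command="kubectl rollout undo deployment/orders-api",
    expected_recovery="about 2 minutes",
)
print("\n".join(runbook))
```

Raising an error when the data step is missing turns the undocumented failure mode into something caught before the deploy starts.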
Canary releases that gate on pod readiness rather than application metrics give the team false confidence. /relay-deploy configures gates on actual application metrics (error rate, latency, custom signals) so the canary catches application-level regressions, not just pod-level ones.
Tonone's /relay-deploy skill designs deployment strategies (rolling, canary, blue-green) with metric-based gates, smoke tests, and explicit rollback procedures.
When to use /relay-deploy, and when not to
/relay-deploy is the right call when you are deciding how to deploy a service safely, setting up zero-downtime deployments for a service that cannot have maintenance windows, or preparing a documented rollback plan before a high-stakes deploy.
Skip the skill for one-off scripts and manual deploys (the strategy is overhead). For full CI/CD pipeline configuration, /relay-pipeline is the right call. For container builds, /relay-docker produces production-ready Dockerfiles.
| Capability | Tonone | Generalist chatbot | Cursor / Copilot |
|---|---|---|---|
| Strategy chosen for service characteristics | Yes, rolling/canary/blue-green chosen deliberately | Single deploy step | Not in scope |
| Gates on application metrics | Yes, error rate + latency | Pod readiness only | Not in scope |
| Smoke tests at each step | Yes, critical path coverage | No smoke tests | Not in scope |
| Documented rollback procedure | Yes, before deploy starts | Invent at 2am | Not in scope |
| Data rollback when needed | Yes, paired with code rollback | Not considered | Not in scope |
A worked example: canary deploy for a payments API
Suppose the brief is: configure a canary deploy for the payments API, where any regression in error rate or latency must trigger automatic rollback. Run /relay-deploy.
```yaml
# argo-rollouts/payments-api.yaml (excerpt)
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata: { name: payments-api }
spec:
  replicas: 12
  strategy:
    canary:
      maxSurge: 25%
      maxUnavailable: 0
      steps:
        - setWeight: 5
        - pause: { duration: 5m }
        - analysis:
            templates: [{ templateName: payments-error-rate }]
        - setWeight: 25
        - pause: { duration: 10m }
        - analysis:
            templates: [{ templateName: payments-error-rate }]
        - setWeight: 50
        - pause: { duration: 10m }
        - analysis:
            templates: [{ templateName: payments-error-rate }]
        - setWeight: 100
      analysis:
        templates:
          - templateName: payments-latency-p99
        startingStep: 1

# Analysis template: payments-error-rate
#   Fails the rollout if error rate > 2x baseline for 1 minute
#
# Analysis template: payments-latency-p99
#   Fails the rollout if p99 > 1.4x baseline for 2 minutes
#
# Smoke test (runs at each step)
#   scripts/smoke/payments-canary.sh
#     curl -f https://canary.payments.internal/health
#     curl -f https://canary.payments.internal/v1/charges/test_smoke
#
# Rollback procedure:
#   - Automatic via Argo Rollouts on analysis failure
#   - Manual: kubectl argo rollouts abort payments-api
#     (then kubectl argo rollouts undo payments-api)
#   - Recovery time: ~30s for a 5-25% canary, ~2min for full
```

The deploy is gated on actual application metrics. Regressions are caught at 5% traffic and rolled back automatically. The rollback procedure is documented and tested before the first canary runs. The on-call engineer at 2am has a runbook, not an invention task.
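The "2x baseline for 1 minute" condition in the analysis template comments can be read as a sustained-breach check: a single bad sample does not fail the rollout, but an unbroken run of them does. The Python below only illustrates those semantics and is not how Argo Rollouts evaluates an AnalysisTemplate. The function name, the 10-second sampling interval (so one minute is six samples), and the sample values are all assumptions.

```python
def sustained_breach(samples: list[float], baseline: float, factor: float, window: int) -> bool:
    """True if `window` consecutive samples exceed baseline * factor."""
    run = 0
    for value in samples:
        # Extend the current run of breaching samples, or reset it on a healthy one.
        run = run + 1 if value > baseline * factor else 0
        if run >= window:
            return True
    return False


BASELINE_ERROR_RATE = 0.002  # hypothetical baseline

# Two isolated spikes never form a six-sample run, so the rollout continues.
spiky = [0.001, 0.010, 0.001, 0.010, 0.001, 0.001, 0.001]
print(sustained_breach(spiky, BASELINE_ERROR_RATE, factor=2.0, window=6))  # False

# Six consecutive samples above 2x baseline fail the rollout.
regressed = [0.001] + [0.010] * 6
print(sustained_breach(regressed, BASELINE_ERROR_RATE, factor=2.0, window=6))  # True
```

Requiring a run of samples trades a slower reaction for fewer false rollbacks; a shorter window reacts faster but is more sensitive to noise.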
Related skills
/relay-deploy covers deploy strategy. For the full CI/CD pipeline that builds the artifact and hands it off to the deploy, /relay-pipeline is the right call. For the container that the deploy ships, /relay-docker produces production-ready Dockerfiles. For SLO-based alerts that catch deploy regressions, /vigil-alert is calibrated to that work.
Install
/relay-deploy ships with the Relay agent in the Tonone for Claude Code package. Install Tonone, invoke /relay-deploy from any Claude Code session, and the skill produces the deploy strategy calibrated to the service's risk profile and infrastructure.
1. Add to marketplace
2. Install Relay
Deploys are the moment new code meets production traffic. The skill is built so that meeting is gated, observed, and reversible by default.
Frequently asked questions
- What does /relay-deploy do?
- It designs a deployment strategy (rolling, canary, or blue-green) with metric-based gates, smoke tests at each promotion step, and explicit rollback procedures documented before the deploy runs.
- What deploy tools does /relay-deploy support?
- Argo Rollouts, Flagger, native Kubernetes Deployments, AWS ECS, GCP Cloud Run, Vercel, and Spinnaker. The skill matches the project's existing tool.
- How is /relay-deploy different from a single deploy command?
- A single command does not gate on metrics, run smoke tests, or document rollback. /relay-deploy produces all three together so the deploy is observed and reversible.
- When should I use /relay-deploy?
- When you are deciding how to deploy a service safely, setting up zero-downtime deployments, or preparing a documented rollback plan before a high-stakes deploy.
- Does /relay-deploy handle data rollback?
- Yes. If the deploy includes schema changes that would prevent the old version from reading the data, the rollback procedure includes the data rollback step paired with the code rollback.
- How do I install /relay-deploy?
- Install Tonone for Claude Code via the get-started guide at tonone.ai/get-started. /relay-deploy ships with the Relay agent and is invoked as a slash command in any Claude Code session. Tonone is free and MIT-licensed.
- Is /relay-deploy free?
- Yes. The skill is part of Tonone, which is MIT-licensed. The only cost is Claude Code token usage during the work.
- Does /relay-deploy support Vercel and serverless deploys?
- Yes. For Vercel, the skill produces preview-then-production promotion with smoke tests. For serverless platforms with limited rollout primitives, the skill uses traffic-shifting where supported (Lambda aliases, Cloud Run revisions).