A field guide to the /spine-perf skill

Fix Backend Performance Bottlenecks with AI

Most slow endpoints have one of five root causes. /spine-perf diagnoses N+1 queries, missing indexes, sync I/O, cache gaps, and pool exhaustion, then fixes them in priority order.

Spine · Backend · 10 min read · March 12, 2026

Backend performance bugs almost always come from one of five root causes. The endpoint runs an N+1 query because the ORM was used naively. The database is missing an index that the query needs. A piece of synchronous I/O sits in a request handler and blocks the event loop. A cache that should exist does not, so every request hits the slow path. The connection pool is too small or has leaked, so requests queue waiting for a connection that never frees. These five causes account for the vast majority of latency complaints, and the fix for each is well understood. The reason teams chase these for weeks is not that the fixes are hard; it is that diagnosing which of the five is responsible takes time the team did not budget for.

A senior backend engineer asked to fix a slow endpoint runs through a checklist. Look at the query log and count queries per request. Look at the explain plans for the high-cost queries. Look at the request handler for synchronous I/O. Look at the cache hit rate for the relevant key prefix. Look at the connection pool metrics. The checklist takes an hour to run through and points at the answer roughly nine times out of ten. The remaining cases are the genuinely interesting ones (a hot path on a third-party API, a memory leak, an inefficient serialization). Most teams do not run the checklist because nobody wrote it down; the senior engineer who knows it runs it from memory. The /spine-perf skill encodes the checklist so it gets applied every time, not just when the senior engineer has time.

Why generalist AI gives bad performance advice

Ask Cursor or ChatGPT "how do I make this endpoint faster" and you get a list of generic optimizations: add caching, use async, profile, look at indexes. The list is correct in the abstract and useless in practice because it is not grounded in your endpoint's actual behavior. The real answer is one of the five causes, and the only way to find which is to look at the actual data. A generalist tool cannot read the query log, cannot run an explain plan, cannot inspect the connection pool metrics. It can suggest fixes for a hypothetical slow endpoint; it cannot diagnose the slow endpoint in front of you.

The other failure mode is suggestion bias. Generalist tools tend to suggest the most visible optimizations ("add a cache") rather than the most effective ones ("this query is missing an index"). A cache in front of a query that is slow only because it lacks an index is wasted complexity; the index that fixes the underlying query is the right answer. The diagnostic step is the difference between guessing at a fix and applying the right fix the first time.

What performance diagnosis actually requires

A useful performance pass has three phases. Phase one: characterize the bottleneck. Where is the time going? Database, application code, external API, network? The answer comes from per-request traces or per-handler profiling. Phase two: identify the root cause from the five usual suspects. Once the time is localized to the database, look for N+1 patterns and missing indexes; once it is localized to handler code, look for sync I/O and serialization waste; once it is localized to the connection layer, look for pool exhaustion and DNS lookups. Phase three: apply the fix and verify. The fix is rarely the dramatic refactor; it is usually a single index, a single batch query, a single async wrapper. The verification is the rerun of the trace to confirm the time has moved.
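The hand-off from phase one to phase two can be sketched as a lookup from the dominant layer to the suspects worth checking. This is a minimal illustration, not the skill's implementation; the layer names, the suspect labels, and the function `suspects_for` are all hypothetical.

```python
# Hypothetical mapping from the layer where the time goes to the suspects
# the text associates with that layer.
SUSPECTS_BY_LAYER = {
    "database": ["n_plus_one", "missing_index"],
    "handler": ["sync_io", "serialization_waste"],
    "connection": ["pool_exhaustion", "dns_lookups"],
}


def suspects_for(breakdown):
    """Return the dominant layer and the suspects to check for it.

    breakdown maps a layer name to its share of request time (0..1).
    Layers without an entry in the mapping yield an empty suspect list.
    """
    layer = max(breakdown, key=breakdown.get)
    return layer, SUSPECTS_BY_LAYER.get(layer, [])


layer, suspects = suspects_for({"database": 0.88, "handler": 0.08, "connection": 0.04})
print(layer, suspects)
```

The point of the lookup is ordering: the suspects for the other layers are not examined until the trace says the time is there.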

The discipline is sequential. Skipping the diagnosis step (jumping to a cache because caches are visible) wastes effort. Skipping the verification step (declaring victory before the trace confirms) leaves the bug to recur. The discipline is also surprisingly compressible: the trace, the explain plan, and the pool metrics are all available in any reasonably instrumented service, and reading them is the work that points at the answer.

How /spine-perf works

Step one: localize the time

When invoked, /spine-perf reads the slow endpoint's traces (from the project's APM) and produces a time breakdown: how much time is in the database, the application code, the external APIs, and the network. The breakdown is the first input to the diagnosis. If 90% of the time is in the database, the rest of the analysis focuses there. If 90% is in handler code, the focus shifts to sync I/O and serialization. The skill does not guess at the breakdown; it reads it from the data the project already collects.
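As a rough sketch of what a time breakdown involves, the snippet below sums span durations per layer and converts them to shares of the total. The span shape `(layer, duration_ms)` and the numbers are illustrative assumptions; a real APM export will look different, and this is not how /spine-perf is known to read its data.

```python
from collections import defaultdict

# Hypothetical span records for one slow request: (layer, duration in ms).
spans = [
    ("database", 2100.0),
    ("handler", 190.0),
    ("network", 70.0),
    ("other", 40.0),
]


def time_breakdown(spans):
    """Return each layer's share of total span time as a fraction of 1."""
    totals = defaultdict(float)
    for layer, duration_ms in spans:
        totals[layer] += duration_ms
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {layer: ms / grand_total for layer, ms in totals.items()}


breakdown = time_breakdown(spans)
dominant = max(breakdown, key=breakdown.get)
print(dominant, round(breakdown[dominant] * 100, 1))
```

With these sample spans the database dominates, so the rest of the analysis would concentrate on N+1 patterns and missing indexes.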

Step two: walk the five usual suspects

For each suspect, the skill checks the relevant data. N+1: count queries per request and check for repeated query patterns. Missing index: pull explain plans for the high-cost queries and check for sequential scans. Sync I/O: scan the handler for blocking calls (file I/O in a route, synchronous HTTP calls). Cache gaps: check the cache hit rate for the relevant key prefix. Pool exhaustion: check pool size, in-use count, and wait time. Each suspect produces a yes/no/maybe verdict with the supporting evidence.
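The N+1 check described above (count queries per request, look for repeated patterns) can be approximated by normalizing away literals and counting identical shapes. This is a minimal sketch under stated assumptions: the regex-based `normalize`, the `threshold` of 10, and the verdict rules are illustrative choices, not values taken from /spine-perf.

```python
import re
from collections import Counter


def normalize(sql):
    """Collapse literals so queries differing only in parameters compare equal."""
    sql = re.sub(r"'[^']*'", "?", sql)   # string literals
    sql = re.sub(r"\b\d+\b", "?", sql)   # standalone numeric literals
    return re.sub(r"\s+", " ", sql).strip().lower()


def n_plus_one_verdict(queries, threshold=10):
    """Return (verdict, evidence) for the query log of a single request.

    threshold is an assumed cutoff for how many repeats count as N+1.
    """
    counts = Counter(normalize(q) for q in queries)
    if not counts:
        return "no", None
    pattern, repeats = counts.most_common(1)[0]
    if repeats >= threshold:
        return "yes", (pattern, repeats)
    if repeats >= threshold // 2:
        return "maybe", (pattern, repeats)
    return "no", (pattern, repeats)


# Hypothetical log: one parent query, then one lookup per child row.
log = ["SELECT * FROM tasks WHERE project_id = 7"] + [
    f"SELECT * FROM users WHERE id = {i}" for i in range(1, 41)
]
verdict, evidence = n_plus_one_verdict(log)
print(verdict, evidence)
```

The evidence tuple (the repeated pattern and its count) is what turns a verdict into something a reviewer can check against the query log.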

Step three: produce the prioritized fix list

The output is a ranked list of fixes with the expected impact and the effort. "Add an index on users(account_id, created_at): estimated 40% latency reduction, 1 hour to apply." "Batch the N+1 query in /api/projects/:id/contributors: estimated 60% reduction, 1 day to refactor." "Increase pool size from 20 to 50: 0% latency reduction at current load, prevents queue-wait under spike." The list is ordered by impact-per-effort so the team applies the highest-leverage fix first.
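Ordering by impact-per-effort amounts to sorting on a ratio. The sketch below uses the three fixes quoted above; treating "1 day" as 8 working hours and giving the pool change an effort of 0.5 hours are assumptions made only so the ratio can be computed, and the `Fix` class is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Fix:
    description: str
    latency_reduction: float  # estimated fraction of latency removed, 0..1
    effort_hours: float       # must be greater than zero

    @property
    def leverage(self):
        """Estimated latency reduction per hour of effort."""
        return self.latency_reduction / self.effort_hours


fixes = [
    Fix("Batch the N+1 query in /api/projects/:id/contributors", 0.60, 8.0),
    Fix("Increase pool size from 20 to 50", 0.0, 0.5),
    Fix("Add an index on users(account_id, created_at)", 0.40, 1.0),
]

ranked = sorted(fixes, key=lambda fix: fix.leverage, reverse=True)
for fix in ranked:
    print(f"{fix.leverage:.3f}  {fix.description}")
```

A pure ratio ranks the pool change last because it buys no latency at current load; a team preparing for a spike may reasonably weigh that fix on risk rather than on this score.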

Step four: apply and verify

Once the team approves a fix, the skill applies it and reruns the trace to verify. The verification is the proof that the fix worked: the latency moved, the query count dropped, the pool wait time went to zero. If the verification does not show the expected improvement, the diagnosis was wrong and the skill loops back to step two with the new data.
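The verification step can be pictured as a before/after comparison of a tail percentile against a target. The following is a sketch only: the nearest-rank `percentile` helper, the `verify_fix` function, and the latency samples are all illustrative assumptions rather than the skill's actual logic.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of latencies."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def verify_fix(before_ms, after_ms, target_p99_ms):
    """Compare p99 before and after a fix and report whether the target was met."""
    p99_before = percentile(before_ms, 99)
    p99_after = percentile(after_ms, 99)
    return {
        "p99_before_ms": p99_before,
        "p99_after_ms": p99_after,
        "confirmed": p99_after <= target_p99_ms,
    }


# Hypothetical samples: 100 requests before and after the change.
before = [380.0] * 98 + [2400.0, 2400.0]
after = [120.0] * 98 + [320.0, 320.0]
result = verify_fix(before, after, target_p99_ms=400.0)
print(result)
```

If `confirmed` comes back false, the diagnosis is treated as wrong and the loop returns to the suspects with the new trace as input.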

The most underappreciated optimization is the index. A single index on the right columns can produce 10-100x latency reductions, and most slow endpoints have at least one missing index waiting to be added. /spine-perf checks for this first because it is so often the answer.
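The effect of an index on a query plan can be seen with SQLite from the Python standard library, used here purely as a stand-in for whatever database the service runs. The table, the index name `idx_users_account_created`, and the sample rows are hypothetical; the exact plan wording varies by SQLite version, and other databases report plans differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, account_id INTEGER, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO users (account_id, created_at) VALUES (?, ?)",
    [(i % 50, f"2026-01-{(i % 28) + 1:02d}") for i in range(1000)],
)

QUERY = "SELECT id FROM users WHERE account_id = ? ORDER BY created_at"


def plan(connection, sql, params):
    """Return the detail text of each EXPLAIN QUERY PLAN row."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


# Without an index the planner has to scan the whole table.
plan_before = plan(conn, QUERY, (7,))

conn.execute(
    "CREATE INDEX idx_users_account_created ON users (account_id, created_at)"
)

# With the composite index the planner can search by account_id directly.
plan_after = plan(conn, QUERY, (7,))

print(plan_before)
print(plan_after)
```

The composite index covers both the filter column and the sort column, which is why one index can replace both the scan and the separate sort step.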

Tonone's /spine-perf skill diagnoses backend performance bottlenecks against the five usual suspects (N+1, missing indexes, sync I/O, cache gaps, pool exhaustion) and produces a prioritized fix list with expected impact.

When to use /spine-perf, and when not to

/spine-perf is the right call when specific endpoints are slow under normal load, when query counts per request seem unexpectedly high, or when preparing for a traffic spike and the team wants to know what will fail first. The clearest signal is a team investigating a performance complaint without a clear hypothesis. Run the skill before guessing at fixes.

Skip the skill for slow queries specifically (use /flux-query for direct query optimization with execution plan analysis). For frontend performance, /prism-audit covers bundle size and rendering. For full-system load testing and contract tests, /proof-api is the right call.

| Capability | Tonone | Generalist chatbot | Cursor / Copilot |
| --- | --- | --- | --- |
| Reads actual traces from APM | Yes, time breakdown per layer | Generic optimization advice | Not applicable |
| Walks the five usual suspects systematically | Yes, each with evidence | Random suggestion order | Not in scope |
| Prioritized fix list with impact estimate | Yes, ranked impact-per-effort | Unranked list | Not in scope |
| Verifies fix with trace rerun | Yes, latency before/after | No verification step | Not applicable |
| Distinguishes diagnostic from optimization | Yes, root cause first | Suggests fixes without diagnosis | Not in scope |

A worked example: /api/projects/:id is slow

Suppose the brief is: GET /api/projects/:id has p99 of 2.4s in production. Run /spine-perf and the output is the diagnosis.

```text
Endpoint: GET /api/projects/:id
Observed: p99 2.4s, p50 380ms (last 24h, prod)

Time breakdown:
  database: 88% (2.1s)
  handler:   8% (190ms)
  network:   3% (70ms)
  other:     1%

Query count per request: 47 (high; baseline ~5)

Diagnosis: N+1 query (high confidence)
  - Hot loop in handler iterates project.tasks and calls
    task.assignee per task.
  - Each iteration triggers a SELECT users WHERE id = ?.
  - Average project: 40 tasks, hence 40+ extra queries.

Fix (recommended):
  Use the ORM's includes/preload to fetch tasks with assignees
  in a single query. Estimated impact: p99 2.4s -> 320ms.
  Effort: 30 minutes (single eager-load change).

Secondary findings:
  - users(id) index present and used (not the issue here).
  - Connection pool at 70% utilization; not exhausted.
  - No cache layer for project detail; consider after fixing
    the N+1 (cache adds value once base query is fast).

Verification plan:
  After fix lands, observe p99 in APM for 1 hour;
  expect <400ms p99 if diagnosis is correct.
```

The diagnosis is the value: the team now knows the issue is N+1, not a missing index, not a cache gap. The fix is 30 minutes. The verification step closes the loop. Without /spine-perf, the team would have spent a day chasing the wrong fixes (adding a cache, increasing the pool size) before someone got around to counting the queries.
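To make the N+1 pattern in this example concrete, the sketch below reproduces it against an in-memory SQLite database and counts statements for the per-task lookup versus a single JOIN. The schema, the `CountingDB` wrapper, and both functions are hypothetical stand-ins; a real fix would use the project ORM's own eager-loading feature rather than hand-written SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tasks (id INTEGER PRIMARY KEY, project_id INTEGER, assignee_id INTEGER);
""")
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(i, f"user{i}") for i in range(1, 41)],
)
conn.executemany(
    "INSERT INTO tasks (project_id, assignee_id) VALUES (1, ?)",
    [(i,) for i in range(1, 41)],
)


class CountingDB:
    """Thin wrapper that counts how many statements are executed."""

    def __init__(self, connection):
        self.connection = connection
        self.count = 0

    def query(self, sql, params=()):
        self.count += 1
        return self.connection.execute(sql, params).fetchall()


def assignees_n_plus_one(db, project_id):
    tasks = db.query(
        "SELECT id, assignee_id FROM tasks WHERE project_id = ? ORDER BY id",
        (project_id,),
    )
    names = []
    for _task_id, assignee_id in tasks:
        # One extra query per task: this is the N+1 pattern.
        rows = db.query("SELECT name FROM users WHERE id = ?", (assignee_id,))
        names.append(rows[0][0])
    return names


def assignees_batched(db, project_id):
    # A single JOIN replaces the per-task lookups.
    rows = db.query(
        "SELECT u.name FROM tasks t JOIN users u ON u.id = t.assignee_id "
        "WHERE t.project_id = ? ORDER BY t.id",
        (project_id,),
    )
    return [name for (name,) in rows]


slow, fast = CountingDB(conn), CountingDB(conn)
names_slow = assignees_n_plus_one(slow, 1)
names_fast = assignees_batched(fast, 1)
print(slow.count, fast.count)
```

Both functions return the same names; only the number of round trips to the database differs, and that count is what the query log exposes.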

/spine-perf covers backend performance diagnosis. For slow database queries specifically, /flux-query is calibrated to execution plan analysis and index recommendations. For API contract and load testing, /proof-api is the right call. For frontend performance, /prism-audit handles bundle and rendering.

Install

/spine-perf ships with the Spine agent in the Tonone for Claude Code package. Install Tonone, configure APM tool credentials, and the skill diagnoses bottlenecks against actual production traces.

1. Add to marketplace

$ claude plugin marketplace add tonone-ai/tonone

2. Install Spine

$ claude plugin install spine@tonone-ai

Performance bugs end faster when the diagnosis is grounded in evidence rather than guesses. The skill is built so the diagnosis loop runs in minutes instead of days.

Frequently asked questions

What does /spine-perf do?
It diagnoses backend performance bottlenecks by reading APM traces, localizing the time per layer, and walking the five usual suspects (N+1 queries, missing indexes, sync I/O, cache gaps, pool exhaustion). The output is a prioritized fix list.

How is /spine-perf different from a generalist AI suggesting optimizations?
A generalist returns generic advice. /spine-perf reads the actual traces, query logs, and connection pool metrics, then produces a fix grounded in evidence rather than intuition.

When should I use /spine-perf?
When specific endpoints are slow under normal load, when query counts per request seem high, or when preparing for a traffic spike and you want to know what will fail first.

What APM tools does /spine-perf support?
Datadog APM, New Relic, Honeycomb, Sentry Performance, and OpenTelemetry-based stacks. The skill reads from whichever the project has configured.

Does /spine-perf apply the fixes?
Yes, after the team approves the prioritized fix list. The skill applies the change, reruns the trace, and reports whether the latency moved as expected. If not, it loops back to the diagnosis.

How do I install /spine-perf?
Install Tonone for Claude Code via the get-started guide at tonone.ai/get-started. /spine-perf ships with the Spine agent and requires APM credentials. Tonone is free and MIT-licensed.

Is /spine-perf free?
Yes. The skill is part of Tonone, which is MIT-licensed. The only cost is Claude Code token usage during the work.

What is the difference between /spine-perf and /flux-query?
/spine-perf diagnoses the endpoint and identifies which layer is slow. /flux-query optimizes specific slow database queries with execution plan analysis. Use /spine-perf first to localize the issue, then /flux-query if the issue is at the query level.
