APIs consumed by other teams or external customers carry a higher testing bar than internal services. A bug in an internal handler is a bug; a bug in a published API is a breaking change for every consumer. The cost is measured in customer support tickets, version negotiations, and the weekend somebody on the team spent rolling back a deploy that broke the integration partner's webhook ingestion. What prevents this is contract testing: a test suite that proves the API still does what it promises, before the deploy lands. Most APIs ship without it because writing the suite takes time and the team optimistically assumes the API will not break.
A useful API test suite has three layers. Endpoint behavior tests verify each route returns the right status code, the right response shape, and the right error responses. Contract tests verify that the API still matches the published spec (OpenAPI, GraphQL schema, gRPC proto), so a change that violates the contract fails the build. Load tests establish a baseline for performance under realistic traffic so the team knows when a regression is performance-shaped rather than functional. Building all three together is the discipline; the /proof-api skill produces the suite calibrated to the project's API and existing test framework.
Why generalist AI under-tests APIs
Ask Cursor or ChatGPT to test an endpoint. You get a happy-path test: send the request, assert the status is 200. The test exercises one specific input, ignores the error responses, and never validates the full response shape. The contract is unchecked. The load behavior is unmeasured. The API ships with a test suite that catches whatever the engineer remembered to write and misses whatever they did not. That is fine for an internal experiment; it is not enough for an API that consumers depend on.
The harder failure mode is contract drift. A small change (renaming a field, tightening validation, changing error format) is not caught by happy-path tests because the happy path still works. The drift surfaces when a consumer fails after the deploy lands. Contract testing catches the drift at PR time by validating the response against the published schema. Most teams do not have this because writing the schema and the contract test feels redundant when the API code is right there. The redundancy is the point: it gives the team a second source of truth that catches the cases the API code itself cannot self-check.
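To make the drift case concrete, here is a minimal sketch of a renamed field slipping past a status-only test. Everything in it is hypothetical and for illustration only: the `invoiceSchema` contract, the field names, and the `missingRequired` helper are assumptions, not part of /proof-api or any real spec.

```typescript
// Hypothetical published contract: the fields a consumer has been promised.
interface FieldContract {
  required: string[];
}

const invoiceSchema: FieldContract = {
  required: ["id", "amount_cents", "currency"],
};

// Returns the required fields that are absent from a response body.
function missingRequired(
  body: Record<string, unknown>,
  contract: FieldContract,
): string[] {
  return contract.required.filter((field) => !(field in body));
}

// Before the change: the handler returns the documented field name.
const beforeRename = { id: "inv_1", amount_cents: 1000, currency: "usd" };

// After a "harmless" rename: the request still succeeds, so a test that
// only asserts the status code keeps passing.
const afterRename = { id: "inv_1", amountCents: 1000, currency: "usd" };

const driftBefore = missingRequired(beforeRename, invoiceSchema);
const driftAfter = missingRequired(afterRename, invoiceSchema);
```

Here `driftBefore` is empty while `driftAfter` names `amount_cents`, which is the signal a contract check would raise at PR time instead of leaving it for a consumer to discover.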
What an API test suite should cover
Endpoint behavior tests verify the contract per route: the success status code, the success response shape (full schema validation, not just a key check), and the error responses (401 for missing or invalid credentials, 403 for authenticated callers without permission, 404 for missing resources, 422 for validation failures, each with the right body shape). Contract tests verify that the API as a whole matches the published spec: every documented endpoint exists, every documented response field is returned, and no undocumented fields are returned. Load tests establish a baseline at expected traffic, recording the p99 latency and the error rate, so future changes that regress either metric fail the build instead of degrading silently.
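One way to keep error coverage honest is to compare the error statuses the spec documents with the ones the suite actually asserts. The sketch below assumes two hand-written inventories (`documentedErrors` and `testedErrors`, both hypothetical); in a real project the first would be derived from the spec and the second from the test files.

```typescript
// Hypothetical excerpt: error statuses the spec documents per route.
const documentedErrors: Record<string, number[]> = {
  "POST /v1/charges": [401, 403, 422],
  "GET /v1/charges/{id}": [401, 403, 404],
};

// Hypothetical inventory: error statuses the current suite asserts.
const testedErrors: Record<string, number[]> = {
  "POST /v1/charges": [401, 422],
};

// Lists every documented error response that no test exercises.
function coverageGaps(
  documented: Record<string, number[]>,
  tested: Record<string, number[]>,
): string[] {
  const gaps: string[] = [];
  for (const route of Object.keys(documented)) {
    const expected = documented[route] || [];
    const covered = tested[route] || [];
    for (const code of expected) {
      if (covered.indexOf(code) === -1) gaps.push(`${route}: ${code}`);
    }
  }
  return gaps;
}

const gaps = coverageGaps(documentedErrors, testedErrors);
```

With these inputs the report lists the untested 403 on `POST /v1/charges` and all three documented errors on the route that has no tests at all.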
The discipline is to scope the suite to the contract, not to the implementation. Implementation tests are unit tests; contract tests answer a different question. The right contract test for POST /invoices does not check whether the implementation calls the right database function; it checks whether the response matches the documented spec. The two should not be conflated, and treating them separately produces a smaller, more durable suite.
How /proof-api works
Step one: read the API spec
When invoked, /proof-api reads the API spec (OpenAPI, GraphQL schema, gRPC proto) and the existing test framework (Vitest with Supertest, pytest with httpx, JUnit with RestAssured, etc.). The spec is the contract; the test framework is the runtime. The skill produces tests that match the spec and run inside the framework.
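As a rough illustration of what "reading the spec" can mean for an OpenAPI document, the sketch below walks the `paths` object and lists each operation with its documented status codes. It is not the skill's actual implementation; the `listOperations` helper, the trimmed-down types, and the inline `spec` (including the `GET /v1/charges/{id}` route) are assumptions made for the example.

```typescript
// Trimmed-down view of an OpenAPI document: only what this sketch needs.
type HttpMethod = "get" | "put" | "post" | "delete" | "patch";
const METHODS: HttpMethod[] = ["get", "put", "post", "delete", "patch"];

interface OperationObject {
  responses: Record<string, unknown>;
}
type PathItem = Partial<Record<HttpMethod, OperationObject>>;
interface OpenApiDoc {
  paths: Record<string, PathItem>;
}

interface Operation {
  method: string;
  path: string;
  statuses: number[];
}

// Flattens paths -> methods -> responses into one list of testable operations.
function listOperations(doc: OpenApiDoc): Operation[] {
  const ops: Operation[] = [];
  for (const path of Object.keys(doc.paths)) {
    const item = doc.paths[path];
    for (const method of METHODS) {
      const op = item[method];
      if (!op) continue;
      // Keep numeric status codes; keys such as "default" become NaN and are dropped.
      const statuses = Object.keys(op.responses)
        .map(Number)
        .filter((code) => !isNaN(code));
      ops.push({ method: method.toUpperCase(), path, statuses });
    }
  }
  return ops;
}

// Hypothetical inline spec; a real run would parse the project's spec file.
const spec: OpenApiDoc = {
  paths: {
    "/v1/charges": {
      post: { responses: { "201": {}, "401": {}, "422": {}, default: {} } },
    },
    "/v1/charges/{id}": {
      get: { responses: { "200": {}, "404": {} } },
    },
  },
};

const operations = listOperations(spec);
```

Each entry in `operations` is a candidate for one success test plus one test per documented error status, which is the shape the next step builds on.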
Step two: endpoint behavior tests
Each endpoint gets behavior tests covering the success path (with a representative input, assert status and full response shape) and the documented error responses (auth missing, auth invalid, validation failure, not found, conflict). The schema validation uses the spec directly so the assertion stays in sync with the documentation: if the spec says a field is required, the test verifies it appears; if the spec says an enum has three values, the test verifies the response is one of them.
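The required-field and enum checks described above can be sketched with a deliberately small validator. This is an assumption-laden illustration: it handles only a tiny subset of JSON Schema, the `chargeSchema` fields and enum values are hypothetical, and a real suite would more likely delegate to an established schema validation library than hand-roll this.

```typescript
// A tiny, hypothetical subset of JSON Schema: typed properties plus enums.
type PropSchema = { type: "string"; enum?: string[] } | { type: "integer" };

interface ObjectSchema {
  required: string[];
  properties: Record<string, PropSchema>;
}

// Returns human-readable violations; an empty array means the body conforms.
function validate(body: Record<string, unknown>, schema: ObjectSchema): string[] {
  const errors: string[] = [];
  for (const field of schema.required) {
    if (!(field in body)) errors.push(`missing required field: ${field}`);
  }
  for (const field of Object.keys(schema.properties)) {
    if (!(field in body)) continue; // absent optional fields are fine
    const prop = schema.properties[field];
    const value = body[field];
    if (prop.type === "integer") {
      const isInt = typeof value === "number" && Math.floor(value) === value;
      if (!isInt) errors.push(`${field}: expected integer`);
    } else if (typeof value !== "string") {
      errors.push(`${field}: expected string`);
    } else if (prop.enum && prop.enum.indexOf(value) === -1) {
      errors.push(`${field}: "${value}" is not one of ${prop.enum.join(", ")}`);
    }
  }
  return errors;
}

// Hypothetical schema for a charge; the three enum values are illustrative.
const chargeSchema: ObjectSchema = {
  required: ["id", "amount", "status"],
  properties: {
    id: { type: "string" },
    amount: { type: "integer" },
    status: { type: "string", enum: ["succeeded", "pending", "failed"] },
  },
};

const conforming = validate(
  { id: "ch_1", amount: 1000, status: "succeeded" },
  chargeSchema,
);
const violating = validate({ id: "ch_2", status: "refunded" }, chargeSchema);
```

The second body fails twice: `amount` is missing and `refunded` is outside the documented enum. Because the schema object comes from the spec rather than from the test author, tightening the spec tightens every test that uses it.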
Step three: contract tests at the suite level
On top of the per-endpoint tests, a contract pass verifies the API as a whole. Every endpoint in the spec is reachable. Every documented response field is returned. No fields outside the spec leak through. The contract pass runs in CI so a PR that introduces drift fails before merge. For projects that publish multiple API versions, the contract is checked per version with the right deprecation notice for each.
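A suite-level pass along these lines can be sketched as a diff between documented fields and observed responses. The sketch takes the text literally and treats every documented field as expected, so optional fields would need an exemption in practice. The `contractPass` and `diffFields` helpers and the sample routes are hypothetical.

```typescript
interface DriftReport {
  missing: string[]; // documented but not returned
  undocumented: string[]; // returned but not documented
}

// Compares the documented field names with the keys of one response body.
function diffFields(documented: string[], body: Record<string, unknown>): DriftReport {
  const actual = Object.keys(body);
  return {
    missing: documented.filter((f) => actual.indexOf(f) === -1),
    undocumented: actual.filter((f) => documented.indexOf(f) === -1),
  };
}

// Runs the diff for every documented operation. An operation with no
// observed response is reported as unreachable.
function contractPass(
  documented: Record<string, string[]>,
  observed: Record<string, Record<string, unknown> | undefined>,
): string[] {
  const failures: string[] = [];
  for (const op of Object.keys(documented)) {
    const body = observed[op];
    if (body === undefined) {
      failures.push(`${op}: unreachable`);
      continue;
    }
    const report = diffFields(documented[op] || [], body);
    for (const f of report.missing) failures.push(`${op}: missing ${f}`);
    for (const f of report.undocumented) failures.push(`${op}: undocumented ${f}`);
  }
  return failures;
}

// Hypothetical inputs: two documented operations, one observed response.
const contractFailures = contractPass(
  {
    "GET /v1/charges/{id}": ["id", "amount", "status"],
    "POST /v1/refunds": ["id"],
  },
  {
    "GET /v1/charges/{id}": { id: "ch_1", amount: 1000, internal_note: "x" },
  },
);
```

A CI job would fail the build whenever `contractFailures` is non-empty. In this sample it reports a missing `status`, a leaked `internal_note`, and an unreachable refunds endpoint.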
Step four: load test baselines
Load tests use k6, Locust, or the project's existing load tool. The baseline runs at expected traffic levels (read endpoints at typical concurrency, write endpoints at peak hourly throughput) and records the p99 latency and the error rate. Future runs compare to the baseline; a regression in either metric fails the build. The load test does not run on every PR (too expensive); it runs nightly and on release branches.
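The baseline comparison itself is simple arithmetic, sketched below independently of any load tool. The assumptions are mine, not the skill's: p99 is computed with the nearest-rank method, a 10% tolerance absorbs run-to-run noise, and the samples are synthetic. k6 and Locust report these metrics themselves, so a real setup would read them from the tool's output rather than recompute them.

```typescript
interface LoadSample {
  latencyMs: number;
  ok: boolean;
}

interface Baseline {
  p99Ms: number;
  errorRate: number;
}

// Nearest-rank percentile: the smallest value with at least p% of samples at or below it.
function percentile(values: number[], p: number): number {
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Reduces one load run to the two numbers the baseline records.
function summarize(samples: LoadSample[]): Baseline {
  const failed = samples.filter((s) => !s.ok).length;
  return {
    p99Ms: percentile(samples.map((s) => s.latencyMs), 99),
    errorRate: failed / samples.length,
  };
}

// Names the metrics that got worse than the baseline by more than the tolerance.
function regressions(current: Baseline, baseline: Baseline, tolerance = 0.1): string[] {
  const problems: string[] = [];
  if (current.p99Ms > baseline.p99Ms * (1 + tolerance)) problems.push("p99 latency");
  if (current.errorRate > baseline.errorRate * (1 + tolerance)) problems.push("error rate");
  return problems;
}

// Synthetic run: 100 requests with latencies 1..100 ms times `scale`;
// every `failEvery`-th request fails (0 means no failures).
function makeSamples(scale: number, failEvery: number): LoadSample[] {
  const out: LoadSample[] = [];
  for (let i = 1; i <= 100; i++) {
    out.push({ latencyMs: i * scale, ok: failEvery === 0 || i % failEvery !== 0 });
  }
  return out;
}

const recordedBaseline = summarize(makeSamples(1, 0));
const slowerRun = summarize(makeSamples(2, 50));
const unchanged = regressions(recordedBaseline, recordedBaseline);
const regressed = regressions(slowerRun, recordedBaseline);
```

The tolerance is the design choice that matters here. Set it too tight and nightly runs flake on noise; set it too loose and a slow drift accumulates across releases without ever failing a build.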
Schema validation in tests catches more bugs than schema validation at runtime, because tests fail loudly at PR time and runtime validation often gets disabled in production for performance. /proof-api validates the full response shape against the spec in every endpoint test.
Tonone's /proof-api skill builds endpoint behavior tests, contract tests, and load test baselines for REST, GraphQL, and gRPC APIs.
When to use /proof-api, and when not to
/proof-api is the right call when an API is going to external consumers, when you are changing an API and need contract tests to catch downstream breakage, or when you are load testing a new service before launch. The skill is also the right call when a published API has incomplete coverage and the team is consolidating the tests into a single suite.
Skip the skill for purely internal handlers that are not part of a published API (use unit and integration tests in the existing framework). For E2E user journey testing through a UI, /proof-e2e is the right call. For test strategy at the project level, /proof-strategy is the right entry point.
| Capability | Tonone | Generalist chatbot | Cursor / Copilot |
|---|---|---|---|
| Endpoint tests against full response schema | Yes, schema validation per response | Status code only | Not in scope |
| Contract tests at suite level | Yes, every spec endpoint reachable, no drift | Per-endpoint only | Not in scope |
| Load test baselines | Yes, p99 + error rate locked | Not in scope | Not in scope |
| Error response coverage | Yes, 401/403/404/422 explicitly tested | Often missed | Often missed |
| Spec-aware test generation | Yes, OpenAPI/GraphQL/gRPC supported | Hand-written tests | Not applicable |
A worked example: tests for a payments API
Suppose the brief is: build the test suite for the payments API. Run /proof-api against the OpenAPI spec.
// tests/api/charges.spec.ts (excerpt)
import { describe, it, expect } from 'vitest';
import { client, validateAgainstSpec } from './_fixtures';

describe('POST /v1/charges', () => {
  it('creates a charge with valid input', async () => {
    const res = await client.post('/v1/charges', {
      amount: 1000,
      currency: 'usd',
      source: 'src_test_visa',
    });
    expect(res.status).toBe(201);
    validateAgainstSpec(res.body, 'Charge'); // full schema check
    expect(res.body.amount).toBe(1000);
    expect(res.body.status).toBe('succeeded');
  });

  it('returns 401 without auth', async () => {
    const res = await client.unauthenticated().post('/v1/charges', {});
    expect(res.status).toBe(401);
    validateAgainstSpec(res.body, 'AuthError');
  });

  it('returns 422 on invalid currency', async () => {
    const res = await client.post('/v1/charges', {
      amount: 1000,
      currency: 'XYZ',
      source: 'src_test_visa',
    });
    expect(res.status).toBe(422);
    validateAgainstSpec(res.body, 'ValidationError');
    expect(res.body.errors).toContainEqual(
      expect.objectContaining({ field: 'currency' })
    );
  });
});

// tests/contract.spec.ts
// Walks every path in openapi.json and verifies it is reachable
// and that responses match the spec. Fails CI if any drift exists.

// loadtests/charges.k6.js
// Sustains 200 RPS to POST /v1/charges; expects p99 < 350ms,
// error rate < 0.1%. Locked into a baseline file the next run
// compares against.

Endpoint behavior, contract drift, and load baseline all locked in CI. Consumers of the payments API can rely on the documented contract because the suite catches violations before they ship.
Related skills
/proof-api covers API testing. For E2E user journeys through a UI, /proof-e2e is the right call. For test strategy at the project level, /proof-strategy produces the plan. For backend performance diagnosis, /spine-perf is calibrated to bottleneck identification.
Install
/proof-api ships with the Proof agent in the Tonone for Claude Code package. Install Tonone, invoke /proof-api from any Claude Code session, and the skill produces the API test suite calibrated to the project's spec and test framework.
1. Add to marketplace
2. Install Proof
API contracts that hold over time are the ones tested as contracts. The skill is built so the discipline is the default, not the cleanup pass.
Frequently asked questions
- What does /proof-api do?
- It builds API test suites covering endpoint behavior (full response schema validation, error responses), contract tests at the suite level (catches drift between spec and implementation), and load test baselines for p99 latency and error rate.
- What API styles does /proof-api support?
- REST (with OpenAPI), GraphQL (with the published schema), and gRPC (with proto files). The skill matches the project's existing test framework and produces tests in that framework's syntax.
- How is /proof-api different from a generalist writing tests?
- A generalist writes per-endpoint happy-path tests. /proof-api validates the full response shape against the API spec, covers error responses, adds contract tests at the suite level, and generates load test baselines.
- When should I use /proof-api?
- When an API is going to external consumers, when you are changing an API and need contract tests to catch downstream breakage, or when you are load testing a new service before launch.
- Does /proof-api integrate with my CI?
- Yes. The endpoint and contract tests run on every PR; load tests run nightly and on release branches because they are too expensive to run per PR.
- How do I install /proof-api?
- Install Tonone for Claude Code via the get-started guide at tonone.ai/get-started. /proof-api ships with the Proof agent and is invoked as a slash command in any Claude Code session. Tonone is free and MIT-licensed.
- Is /proof-api free?
- Yes. The skill is part of Tonone, which is MIT-licensed. The only cost is Claude Code token usage during the work.
- Does /proof-api handle GraphQL operations?
- Yes. GraphQL queries and mutations are tested with full response shape validation against the schema, plus contract tests that catch deprecations and breaking changes.