What does Tonone's Surge do?

Surge is Tonone's AI growth engineer. It diagnoses activation failure modes, builds retention playbooks by disengagement pattern, designs growth experiments with pre-registered hypotheses and kill conditions, maps PLG loops for specific products, and audits the full growth equation to identify the weakest link before any work begins.

What is the aha moment and how does Surge identify it?

The aha moment is the specific user action or outcome most correlated with long-term retention: the moment when a new user gets the value that makes them come back. Surge's surge-activation skill identifies it by correlating early-session behaviors with long-term retention data, then designs the onboarding flow to minimize time-to-aha.

What is a kill condition and why does Surge include one in every experiment?

A kill condition is a pre-specified signal that triggers early stopping of an experiment if the variant is causing harm before the full sample is collected. Examples include a drop in signup completion rate, an increase in support ticket volume, or a degradation in a guardrail metric. Surge includes kill conditions in every experiment spec to prevent experiments from running to statistical significance while causing undetected damage.

How is Surge different from using a growth playbook or Reforge content?

Growth playbooks and courses describe frameworks and case studies from specific companies. Surge applies growth principles to your product's specific growth equation: your aha moment, your funnel drop-off points, and your retention cohorts. The output is interventions calibrated to your context, not frameworks you need to translate yourself.

Can Surge help with product-led growth if my product has a complex B2B buying process?

Yes. surge-plg is designed for this nuance. For products where full PLG is not viable (enterprise contracts, complex buying committees), it identifies the limited PLG mechanics that can still be built in (shareable outputs, viral features, referral mechanics) and prioritizes them by expected impact on acquisition cost. Not every product can be Dropbox, but almost every product has some PLG surface.

What is the difference between Surge and Lumen for experiment design?

Lumen's lumen-abtest skill handles the statistical design of A/B tests: power calculations, sample sizes, and decision rules for product analytics experiments. Surge's surge-experiment skill handles the growth strategy layer (what to test, why, in what sequence, and with what kill conditions) and connects to Lumen for the statistical rigor. They work best together.

Is Tonone's Surge free?

Yes. Tonone is MIT-licensed and free to use. Surge is one of 23 agents included in the Tonone package. You pay only for Claude Code token usage during the work itself.

How do I start with Surge if I have never done structured growth work?

Start with surge-recon. It maps your current growth equation (acquisition rate, activation rate, and retention at day-7, day-30, and day-90), compares it to category benchmarks, and identifies the single weakest link. That becomes the priority for your first experiment. surge-recon produces a baseline that compounds with every experiment cycle you run from it.

The AI Growth Engineer for Activation and Retention

Growth is the discipline that the rest of product work ultimately serves, and also the discipline most often practiced as ritual rather than science. A team that ships features without measuring activation is not doing product development; they are writing software. A team that runs A/B tests without pre-registered hypotheses is not experimenting; they are generating noise. A team with a retention problem that responds by building more features has inverted the causal chain: retention is downstream of value delivery, and building more features into a product whose users are not yet getting value from the features it already has is not growth work; it is scope expansion. AI growth tools that generate experiment ideas and onboarding copy without diagnosing the actual activation failure mode are the growth equivalent of a gym membership: they provide the infrastructure for the work without ensuring the work is being done in the right direction. Surge was built to do the diagnostic work first: to identify where growth is actually failing before designing interventions, and to design interventions that are specific enough to produce a clear result.

Why the generalist approach fails at growth

A generalist chatbot asked for growth advice produces a list of growth tactics: referral programs, email drip sequences, onboarding checklists, feature announcements, win-back campaigns. The list is internally consistent and would not surprise anyone who has read a growth marketing blog. What it is missing is the diagnosis: which of these tactics is relevant to the specific growth problem the team has, in what sequence, and with what metrics to validate whether it is working. A team with a 28% activation rate does not have the same problem as a team with a 62% activation rate and a 40% day-30 retention rate. The first team needs to fix the first session; the second team has a week-two drop-off problem. Applying the same list of tactics to both produces mediocre results for both, because the tactics were not chosen for the actual failure mode.

Growth hacking playbooks and books (the Reforge course content, the Brian Balfour frameworks, the Andrew Chen essays) are excellent for understanding growth theory and learning the vocabulary. They are not designed to be applied directly to a specific product's specific growth problem. They describe what worked at Airbnb, Dropbox, Pinterest, and Slack. Those case studies are instructive but not prescriptive. The product-led growth loop that worked for Dropbox (the refer-a-friend for storage space mechanic) worked because Dropbox was a storage product where more storage was the core value and sharing was native to the use case. Applying the same loop to a B2B SaaS product with a six-person buying committee and an annual contract produces a broken mechanic, not viral growth. Growth theory requires translation into the specific product context before it produces interventions that work.

A/B test platforms alone (Optimizely, Statsig, LaunchDarkly) provide the infrastructure to run experiments without providing the judgment to design experiments that will produce useful results. An experiment without a pre-registered hypothesis, a minimum detectable effect calculation, and a decision rule for null results is not an experiment; it is a change with monitoring. The platform records the data. The interpretation remains ambiguous. Teams running experiments this way generate a backlog of "inconclusive" results that nobody knows how to act on, while the actual growth problems remain unaddressed because the experiments were not designed to answer a specific question. The infrastructure is necessary; it is not sufficient.

What a growth engineer actually does

A senior growth engineer starts with the current growth equation (the quantified model of how the product acquires, activates, retains, and monetizes users) and identifies the weakest link. Not the most exciting feature to test, but the step in the user journey where the highest percentage of users are failing to get value and leaving. That diagnostic is the most important work in growth, and it is also the most frequently skipped: it requires data analysis, user research synthesis, and the humility to work on boring problems (like an email confirmation flow that loses 30% of signups) rather than exciting ones (like a viral loop that might double acquisition). Once the weakest link is identified, the growth engineer designs experiments that are specific enough to test a single hypothesis, powered correctly to detect a real effect, and structured with a kill condition so that failed experiments produce a clear learning rather than a lingering feature.
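As a rough illustration of the "find the step where the most users are leaving" diagnostic, the sketch below computes step-to-step loss from funnel counts and picks the worst transition. The step names and counts are hypothetical, chosen only so that the email confirmation step loses 30% of signups as in the example above; this is not Surge's implementation.

```python
def step_losses(funnel):
    """Return (transition, loss_rate) for each adjacent pair of funnel steps.

    funnel: ordered list of (step_name, users_reaching_step).
    """
    losses = []
    for (prev_name, prev_n), (name, n) in zip(funnel, funnel[1:]):
        losses.append((f"{prev_name} -> {name}", 1 - n / prev_n))
    return losses


# Hypothetical counts for illustration only.
funnel = [
    ("signup", 1000),
    ("email_confirm", 700),
    ("profile_setup", 630),
    ("first_action", 250),
]

losses = step_losses(funnel)
worst_step = max(losses, key=lambda item: item[1])

for transition, loss in losses:
    print(f"{transition}: {loss:.0%} lost")
print("Weakest link:", worst_step[0])
```

Ranking by loss rate alone is a simplification; a fuller diagnosis would also weigh how many users each step affects and how tractable the fix is.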

The PLG (product-led growth) dimension of growth engineering is particularly nuanced. PLG is not a growth tactic; it is a distribution model where the product itself drives acquisition and expansion through the value it creates in use. Building PLG loops requires understanding where in the product's workflow natural sharing or network effects occur, what the "aha moment" is for the product (the specific moment when a new user gets the value that makes them retain), and how to reduce the time from signup to that moment. These are product design questions as much as growth questions, which is why the best growth engineers work at the intersection of product design, data analysis, and marketing, not within any single one.

Meet Surge

Surge is Tonone's dedicated AI growth engineer, a purpose-built agent for the full growth workflow, from activation funnel diagnosis through experiment design, retention playbook production, and PLG loop mapping. It starts with the diagnostic question: where is growth actually failing, and what is the specific intervention that has the highest probability of fixing it? From that diagnosis, it produces a prioritized experiment queue, a retention playbook, or a PLG loop design, with the specificity and rigor that turns growth work from a list of tactics into a learning system. Surge does not produce generic growth advice; it produces growth interventions calibrated to the specific failure mode it diagnosed.

Tonone's Surge is the AI growth engineer that diagnoses the specific activation or retention failure mode before designing interventions, producing experiments with pre-registered hypotheses, kill conditions, and clear decision rules.

What Surge actually does

Diagnosing activation failure and designing the fix

The surge-activation skill takes the current activation funnel, described or instrumented, and produces a diagnostic analysis with a prioritized experiment queue. The output identifies: the aha moment for the product (the specific user action or outcome that is most correlated with long-term retention), the time-to-aha for the current onboarding flow (how long it takes the average new user to reach the aha moment), the steps in the onboarding flow that have the highest drop-off, and the root cause hypothesis for each major drop-off point. From that diagnosis, it produces three to five experiments in priority order: each one targeting a specific drop-off hypothesis, designed with a hypothesis statement, the expected effect and reasoning, a minimum sample size, the guardrail metrics to monitor, and a kill condition (the signal that would indicate the experiment is causing harm before the full sample is collected). The experiments are sequenced so that each one builds on the learning from the previous: the sequence is a learning roadmap, not just a list of ideas. surge-activation is the skill that answers the question "what should our growth team work on next" with a specific, evidence-grounded answer rather than a prioritization debate.
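One simple way to picture "the action most correlated with long-term retention" is to compare retention between users who did and did not take each candidate action in their first session. The sketch below does that on a handful of made-up users; the `User` type, the action names, and the data are all hypothetical, and the source does not say how surge-activation actually performs the correlation.

```python
from dataclasses import dataclass


@dataclass
class User:
    actions: set[str]  # actions taken during the first session
    retained: bool     # still active at the chosen horizon (e.g. day 30)


def retention_rate(users):
    """Share of users retained; 0.0 for an empty group."""
    return sum(u.retained for u in users) / len(users) if users else 0.0


def rank_aha_candidates(users, candidate_actions):
    """Rank actions by the retention gap between users who did and did not do them."""
    ranked = []
    for action in candidate_actions:
        did = [u for u in users if action in u.actions]
        did_not = [u for u in users if action not in u.actions]
        gap = retention_rate(did) - retention_rate(did_not)
        ranked.append((action, round(gap, 3), len(did)))
    return sorted(ranked, key=lambda row: row[1], reverse=True)


# Hypothetical first-session data for illustration only.
users = [
    User({"connect_integration", "invite_teammate"}, True),
    User({"connect_integration"}, True),
    User({"connect_integration"}, False),
    User({"invite_teammate"}, False),
    User(set(), False),
    User({"edit_profile"}, False),
    User({"edit_profile", "connect_integration"}, True),
]

ranking = rank_aha_candidates(
    users, ["connect_integration", "invite_teammate", "edit_profile"]
)
for action, gap, n in ranking:
    print(f"{action}: retention gap {gap:+.2f} (n={n})")
```

A large gap is a correlation, not proof of causation, which is why the diagnosis feeds an experiment queue rather than a conclusion.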

SurgeGrowth

/surge-activation

Maps the activation funnel from first use to the moment the product delivers its core value (the aha moment). Identifies each step in the journey, measures drop-off rates at each step, and produces an ordered list of activation improvements ranked by expected impact and implementation effort.

When new user activation rates are below benchmark…

Building retention playbooks for at-risk users

The surge-retention skill produces a retention playbook: the behavioral signals that predict churn (leading indicators, not the churn event itself), the intervention triggers calibrated to each signal, the intervention content for each trigger (the email, in-app message, or success call script that addresses the specific disengagement pattern), and the success metric for each intervention. A retention playbook is not a drip sequence; it is a structured response to specific behavioral patterns. A user who has not returned in seven days after onboarding has a different disengagement signal than a user who returned daily for three weeks and then went quiet. The first needs a re-engagement message that addresses the obstacle to their return; the second needs an investigation into what changed. surge-retention produces playbooks that distinguish between these patterns and designs the right intervention for each one. The output also includes a cohort analysis structure: the behavioral segments that have different retention curves, the product changes that moved retention for each segment in the past, and the experiments queued for each segment based on the current gap between actual and target retention. For teams that have never formalized retention work, surge-retention produces a baseline playbook that can be implemented in a week and iterated from there.
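The two disengagement patterns described above can be told apart with a few rules over activity dates. The following is a minimal sketch, assuming a hypothetical `Activity` record and illustrative thresholds (7 quiet days, a 21-day habit window, 80% of days active); the pattern names and the playbook mapping are made up for the example and are not taken from surge-retention's output.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Activity:
    onboarded_on: date
    active_days: list[date]  # days with at least one session after onboarding


def classify_disengagement(activity, today, quiet_days=7, habit_days=21, habit_ratio=0.8):
    """Label a user's disengagement pattern. All thresholds are assumptions."""
    if not activity.active_days:
        waited = (today - activity.onboarded_on).days
        return "never_returned" if waited >= quiet_days else "too_early"
    last_active = max(activity.active_days)
    if (today - last_active).days < quiet_days:
        return "engaged"
    # Was the user active on most days in the window ending at their last session?
    window_start = last_active - timedelta(days=habit_days - 1)
    days_in_window = {d for d in activity.active_days if d >= window_start}
    if len(days_in_window) / habit_days >= habit_ratio:
        return "habit_broken"
    return "drifting"


# Hypothetical mapping from pattern to intervention, following the text above.
PLAYBOOK = {
    "never_returned": "re-engagement message addressing the obstacle to returning",
    "habit_broken": "investigate what changed before sending anything",
}

today = date(2024, 3, 1)
new_user = Activity(onboarded_on=date(2024, 2, 20), active_days=[])
habitual = Activity(
    onboarded_on=date(2024, 1, 25),
    active_days=[date(2024, 1, 26) + timedelta(days=i) for i in range(21)],
)

for user in (new_user, habitual):
    pattern = classify_disengagement(user, today)
    print(pattern, "->", PLAYBOOK.get(pattern, "no trigger"))
```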

SurgeGrowth

/surge-retention

Analyzes retention curves to identify when and why users churn: examines cohort retention by acquisition channel and behavior segment, maps the activities that correlate with retention versus churn, and produces a playbook with specific re-engagement trigger logic, win-back sequence design, and habit loop recommendations.

When retention is below benchmark and you need to …

Designing growth experiments with rigor

The surge-experiment skill produces a complete growth experiment specification: the hypothesis (what change, affecting which user behavior, producing what outcome), the variant description, the primary metric and measurement method, the minimum detectable effect and sample size, the test duration, guardrail metrics, and a kill condition (the specific signal that triggers early stopping). Every surge-experiment output includes a "what we'll learn" section: what the team will know if the experiment wins, what they will know if it loses, and what they will know if it produces a null result. This section prevents the most common experiment failure mode: a null result that produces no learning because the hypothesis was not specific enough to be invalidated. The experiment spec is also designed to be filed as a record: when the team reviews the experiment backlog six months later, they can understand exactly what was tested, why, and what was learned, building the institutional knowledge that compounds over time. surge-experiment connects directly to Lumen's lumen-abtest for the statistical design, ensuring the experimental rigor is consistent across the growth and product analytics workflows.
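To make the link between minimum detectable effect and sample size concrete, here is a sketch of the standard normal-approximation formula for comparing two proportions. It is a textbook calculation under assumed defaults (two-sided alpha of 0.05, 80% power), not necessarily the method lumen-abtest or surge-experiment uses; the function name and parameters are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(baseline, mde, alpha=0.05, power=0.80):
    """Approximate users needed per arm to detect an absolute lift of `mde`.

    baseline: current conversion rate (e.g. 0.24 for 24%).
    mde: minimum detectable effect as an absolute difference (e.g. 0.05 for +5pp).
    """
    p1, p2 = baseline, baseline + mde
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / mde ** 2)


# Smaller effects need far more traffic: required sample grows roughly with 1 / mde^2.
for lift in (0.15, 0.10, 0.05):
    print(f"+{lift:.0%} lift from a 24% baseline: {sample_size_per_arm(0.24, lift)} per arm")
```

Dividing the per-arm figure by the expected daily signups per arm gives a first estimate of test duration, which is one reason the spec fixes both numbers before launch.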

SurgeGrowth

/surge-experiment

Designs a single growth experiment with rigor: states the hypothesis in testable form, specifies the exact metric being tested and how it is measured, establishes a current baseline, calculates minimum detectable effect, determines required run time, and defines explicit kill conditions. One clear, unambiguous design per experiment.

Before running any growth experiment to ensure it …

Mapping product-led growth loops

The surge-plg skill maps the natural PLG loops in a product: the points in the workflow where sharing, collaboration, or network effects occur naturally, the current conversion rates through those loops, the friction points that prevent the loops from closing, and the experiments that would improve loop efficiency. For products where PLG is a viable distribution model, surge-plg produces a loop architecture: the acquisition loop (how users invite others into the product), the engagement loop (how usage deepens over time), and the expansion loop (how team or account growth occurs through usage). Each loop includes the mechanic, the current state, the target state, and a prioritized set of interventions to move from current to target. For products where PLG is less viable (high price point, complex buying process, enterprise contracts), surge-plg identifies the limited PLG elements that can still be built into the product (viral features, shareable outputs, referral mechanics) and prioritizes them by expected impact on acquisition cost.

SurgeGrowth

/surge-plg

Designs a product-led growth motion: self-serve onboarding flow that delivers value without a sales rep, freemium tier definition with clear limits that motivate upgrade, upgrade trigger design based on usage signals, in-product virality mechanics, and the expansion revenue path from free to paid. Grounded in what the product actually does.

When adding a free tier or self-serve motion to a …

Growth intelligence before experiments begin

The surge-recon skill performs a rapid growth audit before any experiment or playbook work begins: it maps the current growth equation (acquisition rate, activation rate, retention at day-7, day-30, and day-90, expansion rate), identifies the weakest link in the funnel, and produces a growth health brief. The brief includes: the current state of each growth lever, the benchmarks for the product's category, the biggest gap between actual and benchmark performance, and the single most important growth decision to make before the next experiment cycle begins. surge-recon also audits the current experiment backlog (if one exists) for quality: experiments with insufficient power, experiments without kill conditions, experiments testing multiple variables simultaneously, and experiments queued out of causal sequence. For teams new to structured growth work, surge-recon is the right entry point: it produces a baseline assessment and a clear priority for the first experiment, preventing the most common startup failure mode of running experiments in random order rather than in the sequence that produces compounding learning.
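The "biggest gap between actual and benchmark performance" step can be sketched as a relative-gap comparison across growth levers. The metric names and every number below are hypothetical placeholders, not real category benchmarks, and the source does not specify how surge-recon weighs the gaps.

```python
def weakest_link(actual, benchmark):
    """Return the metric with the largest relative shortfall versus its benchmark."""
    gaps = {
        metric: (benchmark[metric] - actual[metric]) / benchmark[metric]
        for metric in benchmark
        if metric in actual
    }
    worst = max(gaps, key=gaps.get)
    return worst, gaps


# Placeholder values for illustration only; not real benchmarks.
actual = {
    "activation_rate": 0.24,
    "retention_d7": 0.40,
    "retention_d30": 0.25,
    "retention_d90": 0.15,
}
benchmark = {
    "activation_rate": 0.40,
    "retention_d7": 0.45,
    "retention_d30": 0.30,
    "retention_d90": 0.18,
}

worst, gaps = weakest_link(actual, benchmark)
for metric, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{metric}: {gap:.0%} below benchmark")
print("First priority:", worst)
```

Using the relative rather than absolute gap keeps levers with very different base rates comparable, though it is only one reasonable choice.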

SurgeGrowth

/surge-recon

Audits the current growth state: maps existing acquisition channels with estimated contribution and cost, reads onboarding flows step by step, analyzes retention curve shape by cohort, and inventories any prior growth experiments with their outcomes. Produces a factual picture of where growth is coming from and where it is leaking.

Before designing new growth initiatives to underst…

A worked example

A B2B SaaS team has a 24% day-7 activation rate and does not know why. They have signup data, a basic funnel in Amplitude, and a sense that the onboarding flow is "too long." They run surge-recon first, which identifies that the actual problem is not length: users who reach step 4 of the onboarding (connecting an integration) have a 71% day-7 activation rate, while users who skip step 4 have a 9% rate. The recon output recommends the first experiment: make step 4 mandatory rather than optional.

Then surge-activation produces the full experiment queue. An excerpt of the activation funnel spec looks like this:

## Surge: Activation Funnel Spec

### Current state
Signup → Email confirm (loss: 18%) → Profile setup (loss: 12%)
→ Integration step (loss: 61%) → First core action (loss: 44%)

### Aha moment
Connecting first integration + completing first core action within session.
Evidence: 71% day-7 retention for users who hit both vs. 9% for those who skip.

### Experiment queue (priority order)

**EXP-01: Make integration step mandatory**
  Hypothesis: removing the skip option will increase integration completion
  rate from 39% to ≥65%, driving day-7 activation uplift ≥ +15pp.
  Primary metric: integration completion rate.
  Sample: 1,200 signups/arm. Duration: ~12 days.
  Kill condition: signup completion rate drops by >5pp (users abandoning at
  integration step rather than completing).
  What we'll learn:
  - WIN: step 4 completion rate drives activation; optimize integration UX next.
  - LOSS: skip behavior is not the cause; investigate email confirm drop-off.
  - NULL: any effect is too small to detect at this sample size; test integration pre-fill (EXP-02).

**EXP-02: Pre-fill integration from OAuth on signup**
  Hypothesis: reducing integration friction via OAuth pre-fill will reduce
  time-to-integration by 60% and increase completion rate ≥ 15pp.
  Sequences after EXP-01 (tests friction, not optionality).

**EXP-03: Email confirm removal (magic link direct to onboarding)**
  Hypothesis: eliminating the email confirmation step will reduce signup-to-
  onboarding drop by 15pp for signups from non-enterprise domains.
  Risk: deliverability and spam implications; route to security review first.

The team runs EXP-01. Integration completion jumps from 39% to 61%. Day-7 activation moves from 24% to 38%. The kill condition was not triggered. Because the hypothesis was specific and the decision rule was pre-registered, the team knows exactly what they learned: mandating the integration step was not the whole fix (completion went to 61%, not 65%), which means EXP-02 (reducing friction for users who want to complete it but cannot easily) is the right next experiment. The experiment sequence is a learning roadmap, not a list of features. Three experiments in, the team has a causal model of their activation funnel that they did not have before, and that model will compound into better interventions with every experiment cycle.
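The reading of EXP-01 above follows mechanically once the thresholds are written down before launch. The sketch below encodes the pre-registered numbers from the spec (39% baseline, 65% target, kill if signup completion drops by more than 5pp) and applies them to the observed 61%. The `ExperimentSpec` type, the outcome labels, and the 2.0pp guardrail reading are hypothetical; the source only states that the kill condition was not triggered. The check also ignores statistical uncertainty, which a real analysis would not.

```python
from dataclasses import dataclass


@dataclass
class ExperimentSpec:
    name: str
    baseline: float               # primary metric before the change
    target: float                 # pre-registered success threshold
    guardrail_max_drop_pp: float  # kill condition: max tolerated guardrail drop


def evaluate(spec, observed, guardrail_drop_pp):
    """Apply the pre-registered rule; the kill condition is checked first."""
    if guardrail_drop_pp > spec.guardrail_max_drop_pp:
        return "KILL"
    if observed >= spec.target:
        return "TARGET_MET"
    if observed > spec.baseline:
        return "IMPROVED_BELOW_TARGET"
    return "NO_IMPROVEMENT"


exp01 = ExperimentSpec("EXP-01", baseline=0.39, target=0.65, guardrail_max_drop_pp=5.0)

# guardrail_drop_pp=2.0 is an assumed value for illustration.
outcome = evaluate(exp01, observed=0.61, guardrail_drop_pp=2.0)
print(exp01.name, outcome)
```

Because the thresholds were fixed in advance, "61%, not 65%" reads as a specific finding (optionality was part of the problem, not all of it) rather than an argument about whether the test "worked".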

Before designing your first growth experiment, run surge-recon to identify the weakest link in your current funnel. The most common growth mistake is running experiments that optimize an already-working step while the actual failure mode (a step losing 60% of users) goes unaddressed because it is less exciting to fix. Surge diagnoses first, then designs the experiment sequence that addresses the actual constraint.

Surge vs the alternatives

Surge is not a list of growth tactics and it is not an A/B test platform. It is the growth judgment that determines which experiments to run, in what sequence, with what rigor, and the diagnostic work that ensures experiments are addressing the actual growth constraint rather than the most interesting hypothesis. The comparison below makes the functional differences concrete.

Tonone's Surge surge-retention skill produces retention playbooks that distinguish between disengagement patterns and design a specific intervention for each, not a generic drip sequence applied to all churning users.

Capability	Tonone	Generalist chatbot	Cursor / Copilot
Activation failure diagnosis before experiment design	Yes: aha moment identification, drop-off root cause, experiment queue in causal sequence	Tactic list without diagnosis; not calibrated to the actual failure mode	Growth frameworks without product-specific diagnosis
Retention playbooks by disengagement pattern	Yes: behavioral signal triggers, intervention content for each pattern, success metrics	Generic re-engagement advice; not calibrated to specific churn signals	Describes frameworks, not product-specific playbooks
Experiments with kill conditions and pre-registered rules	Yes: kill condition, what-we'll-learn for win/loss/null, decision rule	Experiment ideas without rigor: no power calculation, no kill condition	Test infrastructure; no experiment design or decision framework
PLG loop mapping and optimization	Yes: acquisition/engagement/expansion loop architecture with current vs. target state	PLG concepts and examples, not mapped to the specific product	Not applicable: general growth content, not product-specific analysis
Growth health audit with benchmark comparison	Yes: surge-recon maps the full growth equation and identifies the weakest link	No quantitative growth audit	Category averages only; no product-specific analysis
Experiment sequence as learning roadmap	Yes: experiments sequenced so each builds on the previous learning	Unsequenced ideas; no causal ordering	Platform runs tests in order received; no strategic sequencing

Tonone's Surge surge-plg skill maps acquisition, engagement, and expansion loops in the specific product context, identifying where PLG mechanics can be built and what current friction is preventing the loops from closing.

Install and try

Tonone is free and MIT-licensed. Install it once and all 23 agents, including Surge, are available in your Claude Code session. You pay only for the Claude Code token usage during work. Start with surge-recon to map your current growth equation and identify the weakest link before designing your next experiment.

1. Add to marketplace

$ claude plugin marketplace add tonone-ai/tonone

2. Install Surge

$ claude plugin install surge@tonone-ai

Pairs well with

Lumen

The metrics that matter, the funnels that leak, and the cohorts no one noticed were churning.

Lens

Dashboards and metrics that drive decisions, not just fill screens.

Crest

What to build, in what order, and why: with frameworks that survive board-level scrutiny.
