Retivo

Experiments

A/B test playbook strategies to find what works best for your users

Experiments let you split users into control and test groups to compare different intervention strategies. Retivo handles assignment, tracking, and statistical analysis, and auto-concludes an experiment once its results are statistically significant.

How It Works

Create Experiment (playbook + variant config)
     │
     ├── Start → Users deterministically assigned
     │            ├── Control: playbook defaults
     │            └── Test: overridden strategy_hints
     │
     ├── Outcomes tracked per variant
     │
     └── Auto-concludes when p < 0.05 (two-proportion z-test)

  1. Pick a playbook to test
  2. Define what's different in the test variant (channel, tone, timing, etc.)
  3. Set the traffic split (e.g., 50/50)
  4. Start the experiment — Retivo assigns users deterministically
  5. Results update daily. When statistical significance is reached, Retivo auto-concludes and declares a winner

Create an Experiment

curl -X POST https://retivo.ai/api/experiments \
  -H "Authorization: Bearer rt_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "playbook_id": "pb_abc123",
    "name": "Re-engagement: email vs in-app",
    "variant_config": {
      "control": { "weight": 0.5 },
      "test": {
        "weight": 0.5,
        "strategy_hints_override": {
          "preferred_channel": "in_app",
          "tone": "casual"
        }
      }
    }
  }'

Response:

{
  "id": "exp_xyz789",
  "name": "Re-engagement: email vs in-app",
  "status": "draft"
}

Variant Config

Field                          Type           Description
control.weight                 number (0-1)   Fraction of users in control group
test.weight                    number (0-1)   Fraction of users in test group
test.strategy_hints_override   object         Strategy hints that override the playbook defaults for the test group

Weights must sum to 1.0. Common splits: 50/50, 70/30 (when you want to limit exposure to the test variant).
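
The weight rules above can be checked client-side before calling the API. This is a minimal sketch, not Retivo's own validation: the function name validate_variant_config is hypothetical, and the server may apply additional checks.

```python
import math

def validate_variant_config(config: dict) -> None:
    """Raise ValueError if the variant weights are malformed (hypothetical helper)."""
    weights = []
    for variant in ("control", "test"):
        if variant not in config or "weight" not in config[variant]:
            raise ValueError(f"missing {variant}.weight")
        weight = config[variant]["weight"]
        if not 0 <= weight <= 1:
            raise ValueError(f"{variant}.weight must be between 0 and 1")
        weights.append(weight)
    # Compare with a tolerance so floating-point rounding cannot cause
    # a valid split to be rejected.
    if not math.isclose(sum(weights), 1.0, abs_tol=1e-9):
        raise ValueError("weights must sum to 1.0")

# A 70/30 split passes silently; an invalid split raises ValueError.
validate_variant_config({"control": {"weight": 0.7}, "test": {"weight": 0.3}})
```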

What You Can Test

The strategy_hints_override can change any field the decision engine reads:

Override                          What it tests
preferred_channel: "in_app"       Email vs in-app delivery
tone: "casual"                    Formal vs casual message tone
focus: "highlight new features"   Different messaging strategy
cooldown_hours: 24                Contact frequency

Start an Experiment

New experiments are created in draft status. Start one when you are ready:

curl -X PUT https://retivo.ai/api/experiments/{id}/start \
  -H "Authorization: Bearer rt_live_..."

Once running, every user evaluated against the associated playbook is deterministically assigned to a variant using SHA256(experiment_id:user_id). The same user always gets the same variant.
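
To illustrate why hash-based assignment is stable, here is a minimal sketch of deterministic bucketing. Only the hash input (experiment_id:user_id) comes from the description above. How Retivo maps the digest to a variant is not documented here, so the bucket calculation, the function name assign_variant, and the control_weight parameter are all assumptions.

```python
import hashlib

def assign_variant(experiment_id: str, user_id: str, control_weight: float = 0.5) -> str:
    """Map a user to a variant from SHA256(experiment_id:user_id)."""
    digest = hashlib.sha256(f"{experiment_id}:{user_id}".encode("utf-8")).digest()
    # Assumption: treat the first 8 bytes as an integer and scale it to [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    # Users whose bucket falls below the control weight go to control.
    return "control" if bucket < control_weight else "test"

# The same inputs always produce the same variant, with no stored state.
first = assign_variant("exp_xyz789", "user_42")
second = assign_variant("exp_xyz789", "user_42")
print(first == second)  # True
```

Because the experiment ID is part of the hash input, a user's bucket in one experiment is unrelated to their bucket in another.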


View Results

curl https://retivo.ai/api/experiments/{id}/results \
  -H "Authorization: Bearer rt_live_..."

Response:

{
  "experiment": {
    "id": "exp_xyz789",
    "name": "Re-engagement: email vs in-app",
    "status": "running",
    "winner": null
  },
  "assignments": {
    "control": 142,
    "test": 138
  },
  "outcomes": [
    { "variant": "control", "outcome_type": "positive", "count": 45 },
    { "variant": "control", "outcome_type": "negative", "count": 28 },
    { "variant": "control", "outcome_type": "neutral", "count": 12 },
    { "variant": "test", "outcome_type": "positive", "count": 58 },
    { "variant": "test", "outcome_type": "negative", "count": 22 },
    { "variant": "test", "outcome_type": "neutral", "count": 15 }
  ]
}

Understanding Results

  • Assignments: How many users were placed in each variant
  • Outcomes: Positive/negative/neutral outcomes per variant, tracked over a 7-day attribution window
  • Positive rate: positive / (positive + negative + neutral) per variant
  • Lift: (test_rate - control_rate) / control_rate × 100%
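
As a worked example, the two formulas above can be applied to the outcomes array from the sample response. This is a sketch for reading results client-side; the helper names positive_rates and lift_percent are hypothetical.

```python
from collections import defaultdict

# Outcome rows copied from the sample /results response.
outcomes = [
    {"variant": "control", "outcome_type": "positive", "count": 45},
    {"variant": "control", "outcome_type": "negative", "count": 28},
    {"variant": "control", "outcome_type": "neutral", "count": 12},
    {"variant": "test", "outcome_type": "positive", "count": 58},
    {"variant": "test", "outcome_type": "negative", "count": 22},
    {"variant": "test", "outcome_type": "neutral", "count": 15},
]

def positive_rates(rows):
    """Return positive / (positive + negative + neutral) for each variant."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in rows:
        totals[row["variant"]] += row["count"]
        if row["outcome_type"] == "positive":
            positives[row["variant"]] += row["count"]
    return {variant: positives[variant] / totals[variant] for variant in totals}

def lift_percent(control_rate, test_rate):
    """Relative improvement of test over control, in percent."""
    return (test_rate - control_rate) / control_rate * 100

rates = positive_rates(outcomes)
print(f"control: {rates['control']:.1%}")  # control: 52.9%
print(f"test: {rates['test']:.1%}")        # test: 61.1%
print(f"lift: {lift_percent(rates['control'], rates['test']):.1f}%")  # lift: 15.3%
```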

Auto-Conclusion

Retivo's daily learning cron analyzes running experiments using a two-proportion z-test. When the p-value drops below 0.05, the experiment auto-concludes:

  • Winner declared: The variant with the higher positive outcome rate
  • Playbook updated: If the test variant wins, its strategy_hints_override is promoted to the playbook's default hints (via the tuning log)
  • Status: Changes to concluded
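
For intuition about the significance check, here is a minimal sketch of a two-proportion z-test. It assumes a pooled standard error, a two-sided p-value, and that each variant's sample size is its total outcome count. Retivo's exact implementation is not documented here and may differ on any of these points.

```python
import math

def two_proportion_z_test(pos_a: int, n_a: int, pos_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference between two proportions."""
    rate_a = pos_a / n_a
    rate_b = pos_b / n_b
    # Pooled rate under the null hypothesis that both variants perform the same.
    pooled = (pos_a + pos_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Both rates are identically 0 or 1: no evidence of a difference.
        return 0.0, 1.0
    z = (rate_b - rate_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Sample response data: control 45 of 85 outcomes positive, test 58 of 95.
z, p = two_proportion_z_test(45, 85, 58, 95)
print(p < 0.05)  # False
```

Under these assumptions the sample data is not yet significant, which is consistent with the experiment still showing status "running".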

You can also manually cancel an experiment:

curl -X PUT https://retivo.ai/api/experiments/{id}/cancel \
  -H "Authorization: Bearer rt_live_..."

Experiment Lifecycle

draft ──► running ──► concluded (auto, when significant)
  │          │
  └──────────┴──► cancelled (manual)

Status      Description
draft       Created but not active. No users assigned yet.
running     Active. Users being assigned and outcomes tracked.
concluded   Statistically significant result found. Winner declared.
cancelled   Manually stopped. No winner declared.
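
The lifecycle can be expressed as a small transition table, which is handy when client code needs to decide whether a start or cancel call makes sense. This is a sketch based only on the diagram above. It assumes concluded and cancelled are terminal states, and the names ALLOWED_TRANSITIONS and can_transition are hypothetical.

```python
# Allowed status changes, read off the lifecycle diagram.
ALLOWED_TRANSITIONS = {
    "draft": {"running", "cancelled"},
    "running": {"concluded", "cancelled"},
    "concluded": set(),  # assumption: terminal
    "cancelled": set(),  # assumption: terminal
}

def can_transition(current: str, target: str) -> bool:
    """Return True if the diagram allows moving from current to target."""
    return target in ALLOWED_TRANSITIONS.get(current, set())

print(can_transition("draft", "running"))    # True
print(can_transition("concluded", "running"))  # False
```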

Dashboard

Experiments can also be managed from the dashboard at Insights → Experiments. The UI provides:

  • Create experiments with a visual form
  • Start/cancel with one click
  • Live results with outcome comparison bars and lift calculation
  • Winner badge when concluded

Best Practices

  • Run one experiment per playbook at a time. Multiple experiments on the same playbook will interfere with each other's results.
  • Wait for significance. Don't cancel an experiment early because interim results look promising. Auto-conclusion only declares a winner once the p < 0.05 threshold is met.
  • Start with 50/50 splits. Unless you have a strong reason to limit exposure, equal splits reach significance faster.
  • Test one variable at a time. If you change channel AND tone simultaneously, you won't know which change drove the result.
  • Minimum sample size. Experiments typically need 50-100 outcomes per variant to reach significance, depending on effect size.
