Experiments

Experiments let you split users into control and test groups to compare different intervention strategies. Retivo handles assignment, tracking, and statistical analysis — experiments auto-conclude when results are significant.

How It Works

Create Experiment (playbook + variant config)
     │
     ├── Start → Users randomly assigned
     │            ├── Control: playbook defaults
     │            └── Test: overridden strategy_hints
     │
     ├── Outcomes tracked per variant
     │
     └── Auto-concludes when p < 0.05 (two-proportion z-test)

1. Pick a playbook to test.
2. Define what differs in the test variant (channel, tone, timing, etc.).
3. Set the traffic split (e.g., 50/50).
4. Start the experiment. Retivo assigns each user to a variant deterministically.
5. Results update daily. When statistical significance is reached, Retivo auto-concludes the experiment and declares a winner.

Create an Experiment

curl -X POST https://retivo.ai/api/experiments \
  -H "Authorization: Bearer rt_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "playbook_id": "pb_abc123",
    "name": "Re-engagement: email vs in-app",
    "variant_config": {
      "control": { "weight": 0.5 },
      "test": {
        "weight": 0.5,
        "strategy_hints_override": {
          "preferred_channel": "in_app",
          "tone": "casual"
        }
      }
    }
  }'

Response:

{
  "id": "exp_xyz789",
  "name": "Re-engagement: email vs in-app",
  "status": "draft"
}

Variant Config

Field	Type	Description
`control.weight`	number (0-1)	Fraction of users in control group
`test.weight`	number (0-1)	Fraction of users in test group
`test.strategy_hints_override`	object	Strategy hints that override the playbook defaults for the test group

Weights must sum to 1.0. Common splits: 50/50, 70/30 (when you want to limit exposure to the test variant).
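
The weight rule above can be checked client-side before the request is sent. The following is a minimal sketch, not part of Retivo's API: validate_variant_config is a hypothetical helper, and the tolerance used for the sum is an assumption.

```python
import math


def validate_variant_config(variant_config: dict) -> None:
    """Raise ValueError if a weight is out of range or the weights do not sum to 1.0.

    Hypothetical client-side check; the server may apply different rules.
    """
    total = 0.0
    for variant in ("control", "test"):
        if variant not in variant_config:
            raise ValueError(f"missing variant: {variant}")
        weight = variant_config[variant].get("weight")
        is_number = isinstance(weight, (int, float)) and not isinstance(weight, bool)
        if not is_number or not 0 <= weight <= 1:
            raise ValueError(f"{variant}.weight must be a number between 0 and 1")
        total += weight
    # Compare with a tolerance because floating-point sums are not always exact.
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"weights must sum to 1.0, got {total}")


# A 70/30 split that limits exposure to the test variant.
validate_variant_config({
    "control": {"weight": 0.7},
    "test": {"weight": 0.3, "strategy_hints_override": {"tone": "casual"}},
})
```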

What You Can Test

The strategy_hints_override object can change any field the decision engine reads:

Override	What it tests
`preferred_channel: "in_app"`	Email vs in-app delivery
`tone: "casual"`	Formal vs casual message tone
`focus: "highlight new features"`	Different messaging strategy
`cooldown_hours: 24`	Contact frequency
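
To illustrate how an override relates to the playbook defaults, here is a minimal sketch that assumes a shallow, key-by-key merge in which the override wins. The merge rule, the effective_hints helper, and the default hint values are assumptions for illustration, not documented Retivo behavior.

```python
def effective_hints(playbook_hints: dict, variant: str, variant_config: dict) -> dict:
    """Return the strategy hints a variant would use.

    Assumption: overrides are merged key by key on top of the playbook defaults.
    """
    override = variant_config.get(variant, {}).get("strategy_hints_override", {})
    return {**playbook_hints, **override}


# Hypothetical playbook defaults, for illustration only.
defaults = {"preferred_channel": "email", "tone": "formal", "cooldown_hours": 48}
config = {
    "control": {"weight": 0.5},
    "test": {"weight": 0.5, "strategy_hints_override": {"preferred_channel": "in_app"}},
}

control_hints = effective_hints(defaults, "control", config)  # unchanged defaults
test_hints = effective_hints(defaults, "test", config)        # channel replaced
```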

Start an Experiment

Experiments are created in draft status. Start one when you are ready:

curl -X PUT https://retivo.ai/api/experiments/{id}/start \
  -H "Authorization: Bearer rt_live_..."

Once the experiment is running, every user evaluated against the associated playbook is deterministically assigned to a variant using SHA256(experiment_id:user_id). Because the assignment depends only on those two IDs, the same user always gets the same variant within an experiment.
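
Hash-based assignment of this kind can be sketched as follows. Only the hash input (experiment_id:user_id) comes from the description above; how Retivo maps the digest to a variant is not documented here, so the bucketing step (first 8 bytes compared against the control weight) and the assign_variant name are assumptions.

```python
import hashlib


def assign_variant(experiment_id: str, user_id: str, control_weight: float = 0.5) -> str:
    """Deterministically map a user to "control" or "test".

    Assumption: the first 8 bytes of the digest are read as an integer in
    [0, 2**64) and compared against control_weight scaled to the same range.
    """
    digest = hashlib.sha256(f"{experiment_id}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")
    return "control" if bucket < control_weight * 2**64 else "test"


# The same inputs always produce the same variant.
first = assign_variant("exp_xyz789", "user_42")
second = assign_variant("exp_xyz789", "user_42")
```

Because nothing is stored per user, any service that knows both IDs can recompute the assignment.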

View Results

curl https://retivo.ai/api/experiments/{id}/results \
  -H "Authorization: Bearer rt_live_..."

Response:

{
  "experiment": {
    "id": "exp_xyz789",
    "name": "Re-engagement: email vs in-app",
    "status": "running",
    "winner": null
  },
  "assignments": {
    "control": 142,
    "test": 138
  },
  "outcomes": [
    { "variant": "control", "outcome_type": "positive", "count": 45 },
    { "variant": "control", "outcome_type": "negative", "count": 28 },
    { "variant": "control", "outcome_type": "neutral", "count": 12 },
    { "variant": "test", "outcome_type": "positive", "count": 58 },
    { "variant": "test", "outcome_type": "negative", "count": 22 },
    { "variant": "test", "outcome_type": "neutral", "count": 15 }
  ]
}

Understanding Results

Assignments: How many users were placed in each variant
Outcomes: Positive/negative/neutral outcomes per variant, tracked over a 7-day attribution window
Positive rate: positive / (positive + negative + neutral) per variant
Lift: (test_rate - control_rate) / control_rate × 100%
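
As a worked example, the two formulas above can be applied to the outcomes array from the sample response. This is a minimal sketch; positive_rates and lift_percent are illustrative helper names, not part of the API.

```python
def positive_rates(outcomes: list) -> dict:
    """Positive rate per variant: positive / (positive + negative + neutral)."""
    totals, positives = {}, {}
    for row in outcomes:
        variant = row["variant"]
        totals[variant] = totals.get(variant, 0) + row["count"]
        if row["outcome_type"] == "positive":
            positives[variant] = positives.get(variant, 0) + row["count"]
    # Skip variants with no tracked outcomes to avoid dividing by zero.
    return {v: positives.get(v, 0) / n for v, n in totals.items() if n > 0}


def lift_percent(rates: dict) -> float:
    """Relative lift of test over control, in percent."""
    return (rates["test"] - rates["control"]) / rates["control"] * 100


# Outcomes copied from the sample response above.
outcomes = [
    {"variant": "control", "outcome_type": "positive", "count": 45},
    {"variant": "control", "outcome_type": "negative", "count": 28},
    {"variant": "control", "outcome_type": "neutral", "count": 12},
    {"variant": "test", "outcome_type": "positive", "count": 58},
    {"variant": "test", "outcome_type": "negative", "count": 22},
    {"variant": "test", "outcome_type": "neutral", "count": 15},
]
rates = positive_rates(outcomes)
lift = lift_percent(rates)
```

For the sample data this gives a positive rate of about 52.9% for control (45 of 85) and 61.1% for test (58 of 95), a lift of roughly 15.3%.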

Auto-Conclusion

Retivo's daily learning cron analyzes running experiments using a two-proportion z-test. When the p-value drops below 0.05, the experiment auto-concludes:

Winner declared: The variant with the higher positive outcome rate
Playbook updated: If the test variant wins, its strategy_hints_override is promoted to the playbook's default hints (via the tuning log)
Status: Changes to concluded
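
For readers who want to reproduce the significance check, here is a minimal sketch of a standard two-sided, pooled two-proportion z-test. The source names the test but not its exact form, so the pooled variance, the two-sided p-value, and the use of total tracked outcomes as the denominator are assumptions.

```python
import math


def two_proportion_z_test(pos_a: int, n_a: int, pos_b: int, n_b: int) -> tuple:
    """Return (z, p_value) for the difference between two proportions.

    Assumptions: pooled standard error and a two-sided p-value.
    """
    rate_a, rate_b = pos_a / n_a, pos_b / n_b
    pooled = (pos_a + pos_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 0.0, 1.0  # all outcomes identical, no detectable difference
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Sample response data: control 45 positive of 85, test 58 positive of 95.
z, p_value = two_proportion_z_test(45, 85, 58, 95)
is_significant = p_value < 0.05
```

Under these assumptions the sample data gives z of about 1.10 and a p-value of about 0.27, which is consistent with the sample experiment still showing status running.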

You can also manually cancel an experiment:

curl -X PUT https://retivo.ai/api/experiments/{id}/cancel \
  -H "Authorization: Bearer rt_live_..."

Experiment Lifecycle

draft ──► running ──► concluded (auto, when significant)
  │          │
  └──► cancelled ◄──┘ (manual)

Status	Description
`draft`	Created but not active. No users assigned yet.
`running`	Active. Users being assigned and outcomes tracked.
`concluded`	Statistically significant result found. Winner declared.
`cancelled`	Manually stopped. No winner declared.
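
The lifecycle diagram can be encoded as a small transition table, for example to guard client-side calls. This is a sketch: the ALLOWED_TRANSITIONS and can_transition names are illustrative, and treating concluded and cancelled as terminal states is an assumption based on the diagram showing no arrows out of them.

```python
# Transitions taken from the lifecycle diagram above.
ALLOWED_TRANSITIONS = {
    "draft": {"running", "cancelled"},
    "running": {"concluded", "cancelled"},
    "concluded": set(),  # assumed terminal
    "cancelled": set(),  # assumed terminal
}


def can_transition(current: str, target: str) -> bool:
    """Return True if the diagram allows moving from current to target."""
    return target in ALLOWED_TRANSITIONS.get(current, set())


can_start = can_transition("draft", "running")
can_restart = can_transition("cancelled", "running")
```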

Dashboard

Experiments can also be managed from the dashboard at Insights → Experiments. The UI provides:

Create experiments with a visual form
Start/cancel with one click
Live results with outcome comparison bars and lift calculation
Winner badge when concluded

Best Practices

Run one experiment per playbook at a time. Multiple experiments on the same playbook will interfere with each other's results.
Wait for significance. Avoid cancelling an experiment early on the basis of interim results: a cancelled experiment declares no winner, whereas auto-conclusion only triggers once the z-test reaches p < 0.05.
Start with 50/50 splits. Unless you have a strong reason to limit exposure, equal splits reach significance faster.
Test one variable at a time. If you change channel AND tone simultaneously, you won't know which change drove the result.
Plan for a minimum sample size. Experiments typically need 50-100 outcomes per variant to reach significance, depending on effect size; the smaller the difference between variants, the more outcomes are needed.
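
To see how the required sample size depends on effect size, the sketch below uses the standard sample-size approximation for comparing two proportions with equally sized variants. It is a planning aid, not something Retivo computes: outcomes_per_variant is a hypothetical helper, and the 80% power target is an assumption (the source only specifies the 0.05 significance threshold).

```python
import math
from statistics import NormalDist


def outcomes_per_variant(control_rate: float, test_rate: float,
                         alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate outcomes needed per variant for a two-sided two-proportion z-test.

    Assumptions: equal-size variants and an 80% power target by default.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    mean_rate = (control_rate + test_rate) / 2
    null_term = z_alpha * math.sqrt(2 * mean_rate * (1 - mean_rate))
    alt_term = z_beta * math.sqrt(
        control_rate * (1 - control_rate) + test_rate * (1 - test_rate)
    )
    return math.ceil((null_term + alt_term) ** 2 / (test_rate - control_rate) ** 2)


large_effect = outcomes_per_variant(0.50, 0.70)  # 20-point difference
small_effect = outcomes_per_variant(0.50, 0.60)  # 10-point difference
```

Under these assumptions, a 20-point difference in positive rate needs a little over 90 outcomes per variant, while a 10-point difference needs roughly 390, so the 50-100 range corresponds to fairly large effects.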
