How Sherpa dynamically optimizes your multivariate campaigns

This article builds on our first part, where we introduced a Bayesian framework and arrived at an intuitive “Chance to beat” metric for analyzing experiment results. We strongly recommend reading that article before going further.

In this article we will cover:

- Static A/B testing and why we need to level up
- Introducing Content Optimization powered by Sherpa


Static A/B Testing: Allocation of users is fixed at the beginning of the experiment

Conventionally, A/B testing works like this: you create a fixed number of variations and randomly allocate a static percentage of users to each. Consider the following points in such a setup:

  1. Some variations will perform well and some poorly in terms of conversions or click-through rates. But the percentages are fixed throughout the experiment, so a significant number of users might keep receiving variations that are not performing well.
  2. It requires marketers to manually inspect results from time to time, infer which variations perform well, and then recreate optimized campaigns.
  3. Further, Smart Trigger and periodic campaigns run for a long time, in some cases for months. Such ongoing manual intervention might not be feasible.
  4. A/B testing is meant for strict experiments where the focus is on statistical significance and hypothesis testing, whereas we want continuous optimization where the focus is on maintaining a higher average conversion/click rate.

Multi-Armed Bandits

Introducing a class of artificial intelligence algorithms called multi-armed bandits, whose name is derived from slot machines in casinos. Let's first understand the analogy between A/B testing and slot machines.

You are in a casino with many different slot machines (bandits), each with a lever. You do not know the underlying payoff rate of each machine. You have a limited number of pulls, and the goal is to maximize your rewards. How do you learn which machine gives the best payoffs? Similarly, in an experiment/campaign we have multiple variations, among which the best needs to be learned, and we have a limited sample size. Rewards come in the form of a better click/conversion rate.



Finding the best variation, like finding the best slot machine, puts us under the explore-exploit dilemma.

In static A/B testing you simply hard-code the allocation to each variation up front. When you start the experiment, the exploration phase begins. After manual inspection, a marketer might infer which variation is performing better and recreate the campaign with the optimized parameters. The new optimized campaign is the exploitation phase. A/B testing is thus full exploration followed by full exploitation. This discrete jump from exploration to exploitation is a drawback: how much time should be spent exploring, and how much exploiting? The dilemma still remains!

Bandit algorithms are the answer to these problems. Instead of two distinct periods of pure exploration and pure exploitation, bandit tests are adaptive, including exploration and exploitation simultaneously. The bottom line is that all bandit algorithms are simply trying to best balance exploration (learning) with exploitation (acting on the current best information). Our implementation of the explore-exploit strategy is based on Bayesian bandits.

Bayesian Bandit Algorithms

Summarizing our first part: we introduced a Bayesian framework where CTR/CVR is modeled as a probability distribution, the Beta distribution, which represents our belief given the sample size. Once each variation in the experiment is modeled with a Beta distribution, we arrive at the chance of beating. At any given point in time we know the probability that a particular variation is the best among all the others.
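As a concrete sketch (illustrative only, not MoEngage's internal code), the Beta posterior over a variation's true CTR can be built from its impressions and clicks using only the standard library's `random.betavariate`. The uniform prior `Beta(1, 1)` is an assumption here:

```python
import random

def ctr_posterior_sample(clicks, impressions):
    """Draw one plausible CTR value from the Beta posterior.

    With a uniform Beta(1, 1) prior, observing `clicks` successes out of
    `impressions` trials gives a Beta(clicks + 1, impressions - clicks + 1)
    posterior over the true click-through rate.
    """
    alpha = clicks + 1               # successes + prior
    beta = impressions - clicks + 1  # failures + prior
    return random.betavariate(alpha, beta)

# A variation with 25 clicks out of 100 impressions: samples cluster near the
# posterior mean 26/102 ~ 0.255, with spread reflecting the limited sample size.
samples = [ctr_posterior_sample(25, 100) for _ in range(10_000)]
print(sum(samples) / len(samples))
```

With more impressions the posterior narrows, which is exactly what lets the algorithm become more confident about each variation's true rate.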

Our bandit algorithm takes this as its base and explores/exploits based on this “chance of beating”. Let's take an example with 3 variations: A, B, C.

When the experiment begins we have no information; we do not know the true underlying click-through rates. So all variations have an equal probability of winning:

A: Chance of beating all 33%

B: Chance of beating all 33%

C: Chance of beating all 33%

As the campaign progresses we start observing impressions and clicks for each of the variations, and we recalculate the chance of beating all:

| Variation | Impressions | Clicks | CTR | Chance of beating all |
|-----------|-------------|--------|-----|-----------------------|
| A         | 100         | 15     | 15% | 3%                    |
| B         | 100         | 20     | 20% | 19%                   |
| C         | 100         | 25     | 25% | 78%                   |

At this point we continue the campaign with the new percentages we have arrived at. C looks like a winner and gets 78% of the allocation; this is exploitation. But the remaining 22% goes to A and B, which is exploration. In this way we keep recomputing the chance to beat, and the allocation keeps changing continuously.
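The “chance of beating all” numbers above can be approximated by Monte Carlo: repeatedly draw one CTR sample from each variation's Beta posterior and count how often each variation comes out on top. A minimal sketch under a uniform-prior assumption (not the production implementation):

```python
import random

def chance_to_beat_all(stats, trials=100_000):
    """Estimate P(variation is best) for each variation via Monte Carlo.

    `stats` maps a variation name to (impressions, clicks). Each trial draws
    one CTR sample per variation from its Beta(clicks + 1, misses + 1)
    posterior; the variation with the highest sample "wins" that trial.
    """
    wins = {name: 0 for name in stats}
    for _ in range(trials):
        draws = {
            name: random.betavariate(clicks + 1, imps - clicks + 1)
            for name, (imps, clicks) in stats.items()
        }
        wins[max(draws, key=draws.get)] += 1
    return {name: wins[name] / trials for name in stats}

# The A/B/C example from above: 100 impressions each, 15/20/25 clicks.
probs = chance_to_beat_all({"A": (100, 15), "B": (100, 20), "C": (100, 25)})
print(probs)  # C dominates, roughly matching the 3% / 19% / 78% split
```

Because these are sampled estimates, the exact figures fluctuate from run to run, but the ordering and rough magnitudes match the table above.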

Introducing Content Optimization powered by Sherpa


We at MoEngage have been consistently focused on building solutions that automate and optimize campaign delivery. To solve the drawbacks of static multivariate experiments mentioned above and to maximize your campaign engagement on the go, we have launched Content Optimization: the message variation with the highest chance of interaction is intelligently predicted on the fly and sent to users to maximize engagement.

Create a campaign using Content Optimization

While creating any multivariate campaign, you can choose to set the distribution manually or let Sherpa (our ML associate) do it for you. Sherpa will dynamically optimize the variation distribution to maximize the campaign's CTR.


Sherpa is most effective for active push campaigns, i.e. Periodic and Smart Trigger campaigns, but adds great value to general push campaigns as well.

Note: To ensure efficient exploration, all general push campaigns created using Content Optimization will be sent over a duration of 60 minutes or the throttling period you have chosen, whichever is higher. Wondering what throttling is?

For campaigns created using Content Optimization (Dynamic A/B Testing powered by Sherpa), marketers will see three additional metrics.

1. Projected CTR: Calculated as the average CTR of the message variations, assuming users were split equally across variations (e.g. 50:50 for two variations or 33:33:33 for three).

2. Final CTR: Calculated as the ratio of total clicks to total impressions across the variations. Same as the campaign CTR.

3. CTR Improvement: The improvement in CTR resulting from the use of Sherpa-powered Content Optimization, calculated as: (Final CTR − Projected CTR) / Projected CTR × 100%


Sample snapshot of campaign metrics for your reference. In this campaign, Variation 2 was allocated 82% of the users vs. only 18% for Variation 1 because of Variation 2's higher CTR in the exploration phase. Leveraging this opportunity, Sherpa improved your CTR from 21.21% to 25.70%, i.e. an improvement of 21.17%, or 4.5 percentage points.


*CTR is Click Through Rate
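The improvement figure in the snapshot follows directly from the two CTRs; a quick check of the arithmetic:

```python
projected_ctr = 21.21  # equal-split baseline, in %
final_ctr = 25.70      # actual campaign CTR, in %

# Relative improvement, as reported on the campaign dashboard
improvement = (final_ctr - projected_ctr) / projected_ctr * 100
print(round(improvement, 2))  # 21.17 (%)

# Absolute gain in percentage points
print(round(final_ctr - projected_ctr, 2))  # 4.49, i.e. ~4.5 pp
```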

How are we doing this?

Content Optimization uses Bayesian bandit algorithms. Bayesian bandits bring efficiency to delivery because traffic moves toward winning variations gradually, instead of forcing you to wait for a “final answer” at the end of an experiment. This is faster because samples that would have gone to obviously inferior variations can be assigned to potential winners. The extra data collected on high-performing variations helps separate the “good” arms from the “best” ones more quickly. Because bandit methods always leave some chance of selecting the poorer-performing options, they continually ‘reconsider’ each option's effectiveness. This provides a working framework for swapping out low-performing options for fresh ones in a continuous process.
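One standard way a Bayesian bandit achieves this gradual traffic shift is Thompson sampling: for each send, draw one CTR sample from every variation's posterior and pick the variation with the highest draw. A hedged sketch under the same uniform-prior Beta assumption as the examples above:

```python
import random

def thompson_pick(stats):
    """Pick a variation for the next user via Thompson sampling.

    `stats` maps variation name -> (impressions, clicks). Sampling from each
    Beta posterior and taking the argmax sends winners more often, while the
    weaker arms still occasionally get picked and 'reconsidered'.
    """
    draws = {
        name: random.betavariate(clicks + 1, imps - clicks + 1)
        for name, (imps, clicks) in stats.items()
    }
    return max(draws, key=draws.get)

# Simulate 10,000 sends against the A/B/C stats from the example above.
stats = {"A": (100, 15), "B": (100, 20), "C": (100, 25)}
counts = {name: 0 for name in stats}
for _ in range(10_000):
    counts[thompson_pick(stats)] += 1
print(counts)  # C receives the bulk of traffic; A and B keep a small share
```

Note that each variation's share of sends naturally approximates its chance of beating all, which is why exploitation (C's large share) and exploration (A's and B's residual share) happen in a single rule.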

