Early Access This is an Early Access feature. To enable it for your account, please contact your MoEngage Customer Success Manager (CSM) or the Support team.
Overview
Campaign decisioning is a specialized application of artificial intelligence that autonomously determines the next best decision for an individual customer in real time. Unlike traditional "if-then" automation or static segments, the MoEngage Decisioning Agent is a self-learning orchestration system built on a "Predict-then-Decide" framework. By analyzing trillions of data points, the agent personalizes the content, channel, and frequency of every interaction at a 1:1 scale to maximize campaign ROI (Return on Investment) and long-term business value.
Agent Components
To function effectively, every agent is built upon the following foundational pillars:
- The reward: Every agent uses a Composite Reward Function, which serves as the mathematical definition of success. This function models user value as a weighted synthesis of events. The agent's intelligence is dedicated to a single purpose: maximizing this total reward over a user's lifetime.
- Guardrails: While the AI is autonomous, you must provide supervision. Marketers set guardrails to ensure brand safety and a positive user experience. These controls include settings for the target audience, frequency caps, allowed exploration limits for experimentation, and agent control groups. The agent continuously optimizes performance strictly within these boundaries.
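The guardrail settings above can be pictured as a small configuration object that the agent consults before every action. This is an illustrative sketch only; the field names and defaults are assumptions, not MoEngage API parameters.

```python
from dataclasses import dataclass

@dataclass
class Guardrails:
    """Hypothetical guardrail settings; names are illustrative."""
    daily_frequency_cap: int = 2      # max messages per user per day
    exploration_limit: float = 0.10   # max share of traffic used for experiments
    control_group_pct: float = 0.05   # users held out from the agent entirely

    def allows_send(self, sends_today: int) -> bool:
        # The agent optimizes strictly within these boundaries:
        # once the cap is reached, no further sends are allowed today.
        return sends_today < self.daily_frequency_cap

g = Guardrails()
print(g.allows_send(1))  # True: still under the daily cap
print(g.allows_send(2))  # False: cap reached
```

However the limits are expressed, the key property is that they are hard constraints: the agent's optimization never overrides them.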
Decisioning Intelligence Process
When a user becomes eligible for engagement, the agent’s intelligence engine executes the following process:
- Observation: The agent gathers the current state of the user to model intent. It analyzes user properties, aggregated behavior (such as purchase frequency and average order value), and the semantic similarity of campaigns, which measures how closely campaign content aligns with the evolving interests of both the individual and their lookalike audiences.
- Prediction: Leveraging a Multi-Task Learning architecture, the agent forecasts multiple outcomes in parallel. It moves beyond simple binary triggers to calculate individual probabilities of reward events for positive engagement, potential churn, or fatigue.
- Decision: The agent ranks all campaign options by aggregating the probabilities of target reward events into a single Composite Reward Score. It evaluates each option against the current context to identify top candidate campaigns that adhere to your guardrails. This phase also runs strategic exploration, particularly for "cold" users or new assets, allowing the agent to discover new strategies rather than relying solely on historical patterns.
- Optimization: The agent closes the loop by gathering outcomes from real-world execution. This feedback reinforces result-yielding strategies and pivots away from underperforming ones, ensuring the agent continuously adapts its logic to maximize ROI.
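The Observation, Prediction, Decision, and Optimization steps above can be sketched as a minimal loop. All function bodies here are illustrative stand-ins, not MoEngage internals; the field names, the toy scoring model, and the weights are assumptions.

```python
import random

def observe(user):
    """Observation: gather the user's current state (stand-in features)."""
    return {"avg_order_value": user.get("aov", 0.0),
            "recency_days": user.get("recency", 30)}

def predict(state, campaign):
    """Prediction: forecast reward-event probabilities (toy model)."""
    base = 0.05 + 0.001 * state["avg_order_value"]
    return {"engage": min(base * campaign["affinity"], 1.0),
            "fatigue": 0.02}

def composite_score(probs):
    """Aggregate predicted probabilities into one score (assumed weights)."""
    return 1.0 * probs["engage"] - 0.5 * probs["fatigue"]

def decide(state, campaigns, explore_rate=0.1):
    """Decision: rank by score, occasionally exploring a random option."""
    if random.random() < explore_rate:
        return random.choice(campaigns)
    return max(campaigns, key=lambda c: composite_score(predict(state, c)))

user = {"aov": 120.0, "recency": 3}
campaigns = [{"id": "A", "affinity": 1.2}, {"id": "B", "affinity": 0.8}]
chosen = decide(observe(user), campaigns, explore_rate=0.0)  # pure exploitation
print(chosen["id"])  # "A": higher predicted composite score
```

The Optimization step would then feed the observed outcome of the chosen campaign back into the prediction model, which the toy `predict` above omits.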
Decision-Making Mechanisms
The agent learns and adapts over time using the following mechanisms:
- Reinforcement learning: Unlike static models that require manual retraining, the agent learns through reinforcement. Every interaction, including a click, a purchase, or a dismissal, serves as a feedback signal. The agent uses Incremental Training to update its parameters in real time. If a decision leads to a reward, the strategy is reinforced; if not, the agent adjusts immediately.
- Contextual Multi-Armed Bandits (CMAB): The decision-making core is powered by CMAB. Unlike standard models that predict a fixed outcome, CMAB serves as an adaptive policy engine that maps specific user context to the next-best decision. This prevents the agent from becoming stuck in a local optimum and ensures it automatically adapts as user behavior and market trends drift. CMAB optimizes its policies by managing the following trade-off:
- Exploitation: The agent uses what it knows to be the best-performing message for a user based on their current profile.
- Exploration: The agent intentionally tests "wildcard" options on small subsets of traffic to discover new patterns. This solves the "cold start" problem for new campaigns.
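The exploitation/exploration trade-off above can be illustrated with a simplified epsilon-greedy bandit. The production CMAB policy is more sophisticated (it conditions decisions on full user context); this is a teaching sketch only, and the class and reward values are assumptions.

```python
import random
from collections import defaultdict

class EpsilonGreedyBandit:
    """Simplified bandit: explore with probability epsilon, else exploit."""

    def __init__(self, arms, epsilon=0.1):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.counts = defaultdict(int)    # pulls per arm
        self.values = defaultdict(float)  # running mean reward per arm

    def select(self):
        if random.random() < self.epsilon:
            return random.choice(self.arms)              # exploration
        return max(self.arms, key=lambda a: self.values[a])  # exploitation

    def update(self, arm, reward):
        """Incremental mean update: learning happens online, per interaction."""
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

bandit = EpsilonGreedyBandit(["campaign_A", "campaign_B"], epsilon=0.0)
bandit.update("campaign_A", 1.0)   # e.g. a click
bandit.update("campaign_B", -0.5)  # e.g. a dismissal
print(bandit.select())  # "campaign_A"
```

Setting `epsilon` above zero reserves a slice of traffic for "wildcard" options, which is how new campaigns escape the cold-start problem: they accumulate feedback even before they look like winners.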
- Composite reward score: The agent uses a Multi-Task Learning architecture to predict each reward and calculates a unified Composite Reward Score for every decision using the following formula:
Reward = (1.0 × Very Good events) + (0.5 × Good events) + (−0.5 × Bad events) + (−1.0 × Very Bad events)
To calibrate this intelligence, you must classify user-positive and user-negative events into four categories: Very Good, Good, Bad, and Very Bad, based on their intent and business impact.
Human Roles in Agent Steering
The human role is critical in providing the high-level inputs that steer the AI:
- Strategic intent: You can define the AI's high-level objectives and translate complex business priorities, such as balancing acquisition volume with long-term margins, into the mathematical rewards the agent uses to learn.
- Governance: You can establish the non-negotiable boundaries, from message frequency and quiet hours to strict brand safety rules, so that the AI honors the user relationship. This ensures that AI speed and scale do not compromise brand integrity.
- Creative judgment: While the agent handles scale, it relies on you to establish the core creative vision and human resonance. By providing a diverse set of creative messages, you allow the AI to strategically map the right value proposition to the users most likely to engage.