Marketing Campaign Cost Prediction using Bayesian Regression in ML


Marketing managers need to forecast the total cost of a digital media campaign before finalising budgets, using early campaign metrics such as impressions, clicks, click‑through rate (CTR), engagement rate, and channel spend allocations (e.g., TV, radio, social media). Campaign costs exhibit nonlinear dependencies (e.g., diminishing returns in high-CTR segments and bulk-buy discounts on high-volume impressions) and are subject to uncertainty from bid-auction dynamics and creative performance. A single point estimate obscures this risk, potentially leading to under‑ or overspend. By applying Bayesian Regression, we obtain both:

1. A point forecast of total campaign cost.

2. A credible interval quantifying forecast uncertainty—enabling data‑driven budget allocation and risk management.

Libraries Required

import pandas as pd                              # data loading & manipulation  
import numpy as np                               # numerical operations  

import matplotlib.pyplot as plt                  # plotting  
import seaborn as sns                            # visualization  

import pymc3 as pm                               # Bayesian modeling  
import arviz as az                               # posterior analysis  

from sklearn.model_selection import train_test_split  
from sklearn.preprocessing import StandardScaler  
from sklearn.metrics import mean_absolute_error  

Dataset

Media Campaign Cost Prediction

Step-by-Step Code Implementation

Data Loading & Preprocessing

  • Data preparation: We load all campaign metrics, drop missing values, and designate Cost as our target.
  • Scaling: We z‑score predictors so that coefficient priors (Normal(0,1)) apply uniformly and MCMC converges stably.
import pandas as pd

# Load the campaign data
df = pd.read_csv("data/media-campaign-cost-prediction.csv")
df = df.dropna()

# Assume the CSV has a 'Cost' column and the rest are predictors
features = [c for c in df.columns if c != 'Cost']
X = df[features].values
y = df['Cost'].values  # USD total campaign cost

# Train/test split (80% train / 20% test)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Standardize predictors for stable MCMC sampling
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s  = scaler.transform(X_test)
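
To make the scaling step concrete, here is a minimal standard-library sketch of the same idea: compute the mean and standard deviation on the training column only, then apply those same statistics to the test column. The impression counts and the helper names fit_scaler and transform are hypothetical and for illustration only; the population standard deviation is used because that is what StandardScaler computes.

```python
import statistics

def fit_scaler(column):
    """Return (mean, std) learned from a training column (population std)."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    # Guard against constant columns, which would otherwise divide by zero
    return mean, (std if std > 0 else 1.0)

def transform(column, mean, std):
    """Apply previously fitted statistics to any column."""
    return [(x - mean) / std for x in column]

# Hypothetical impression counts, not taken from the dataset
train_impressions = [1000.0, 2000.0, 3000.0, 4000.0, 5000.0]
test_impressions = [2500.0, 6000.0]

mean, std = fit_scaler(train_impressions)
train_z = transform(train_impressions, mean, std)
test_z = transform(test_impressions, mean, std)   # reuses the training statistics

print("train z-scores:", [round(z, 3) for z in train_z])
print("test z-scores: ", [round(z, 3) for z in test_z])
```

Fitting on the training split alone keeps information about the test set out of the model. A test value outside the training range simply maps to a z-score beyond the training extremes.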

Define & Fit Bayesian Regression Model

Priors:

  • α ∼ Normal(mean of costs, 2×std) centres the intercept on the typical cost scale.
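  • β ∼ Normal(0, 1) expresses modest initial uncertainty per standardised predictor. Because Cost stays in its original units, this prior is tight if a one-standard-deviation change in a predictor moves cost by much more than a few units; widening sigma or standardising the target are common adjustments.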
  • σ ∼ HalfNormal(std of costs) enforces positive residual noise.

Model: Observed Cost ∼ Normal(α + β·X_std, σ).
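
Written out in full, the priors and likelihood above can be read as the following model. This is only a restatement of the bullets, with a few symbols introduced for convenience: ȳ and s_y denote the mean and standard deviation of the training costs, x_i is the standardised predictor vector for campaign i, and p is the number of predictors. The second argument of each Normal is a variance here, whereas the text and the code specify standard deviations.

```latex
\begin{aligned}
\alpha &\sim \mathcal{N}\big(\bar{y},\ (2 s_y)^2\big) \\
\beta_j &\sim \mathcal{N}(0,\ 1), \qquad j = 1, \dots, p \\
\sigma &\sim \mathrm{HalfNormal}(s_y) \\
y_i \mid \alpha, \beta, \sigma &\sim \mathcal{N}\big(\alpha + \mathbf{x}_i^{\top}\beta,\ \sigma^2\big)
\end{aligned}
```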

Sampling: We draw 2,000 posterior samples per chain after 1,000 tuning steps, with target_accept=0.9 (above the NUTS default of 0.8) so the sampler takes smaller steps and is less prone to divergences.

import pymc3 as pm

with pm.Model() as campaign_cost_model:
    # Priors for intercept and coefficients
    α = pm.Normal("α", mu=np.mean(y_train), sigma=np.std(y_train)*2)
    β = pm.Normal("β", mu=0, sigma=1, shape=X_train_s.shape[1])
    σ = pm.HalfNormal("σ", sigma=np.std(y_train))

    # Expected cost linear predictor
    μ = α + pm.math.dot(X_train_s, β)

    # Likelihood
    Y_obs = pm.Normal("Y_obs", mu=μ, sigma=σ, observed=y_train)

    # Sample the posterior
    trace = pm.sample(
        draws=2000,
        tune=1000,
        target_accept=0.9,
        return_inferencedata=True
    )
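
Before trusting the posterior, it helps to check that the chains agree with each other. The sketch below computes the classic Gelman-Rubin R-hat for one scalar parameter using only the standard library, on synthetic chains. It is a simplified illustration: ArviZ reports a rank-normalised split variant of this statistic, so its numbers will differ somewhat. Values close to 1 suggest the chains have mixed; a commonly quoted rule of thumb is to investigate anything above roughly 1.01.

```python
import math
import random
import statistics

def rhat(chains):
    """Classic Gelman-Rubin R-hat for equal-length chains of one scalar parameter."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])  # W
    between = n * statistics.variance(chain_means)                        # B
    pooled = (n - 1) / n * within + between / n   # pooled variance estimate
    return math.sqrt(pooled / within)

rng = random.Random(0)
# Four synthetic chains drawn from the same distribution: these should agree
mixed = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(4)]
# Four synthetic chains stuck around two different values: these should not
stuck = [[rng.gauss(shift, 1.0) for _ in range(500)]
         for shift in (0.0, 0.0, 3.0, 3.0)]

rhat_mixed = rhat(mixed)
rhat_stuck = rhat(stuck)
print(f"R-hat, well-mixed chains: {rhat_mixed:.3f}")
print(f"R-hat, stuck chains:      {rhat_stuck:.3f}")
```

In the actual workflow, the r_hat column of az.summary(trace) plays this role for every parameter at once.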

Posterior Analysis & Point Predictions

  • We sample Y_obs to obtain full posterior predictive distributions, from which posterior mean forecasts and 94% Highest Posterior Density (HPD) intervals can be derived.
  • Point forecasts on the test set use the posterior means of α and β.

Evaluation: Mean Absolute Error (MAE) on the held‑out test set quantifies average forecast error.

import arviz as az
from sklearn.metrics import mean_absolute_error

# Summarize posterior distributions
print(az.summary(trace, round_to=2))  # print explicitly so the table shows when run as a script

# Posterior predictive sampling
with campaign_cost_model:
    ppc = pm.sample_posterior_predictive(trace, var_names=["Y_obs"])

# Posterior means of parameters
α_post = trace.posterior["α"].mean().item()
β_post = trace.posterior["β"].mean(dim=["chain","draw"]).values

# Point predictions on the test set
y_pred = α_post + X_test_s.dot(β_post)

# Evaluate Mean Absolute Error
mae = mean_absolute_error(y_test, y_pred)
print(f"Test MAE: ${mae:.2f}")
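
MAE only scores the point forecasts. To check the uncertainty bounds as well, one option is to measure how often held-out costs fall inside their 94% intervals. The sketch below is standard-library only and runs on synthetic numbers, not the campaign dataset; the helper names hdi and coverage are hypothetical. It finds the highest-density interval as the narrowest window covering the requested share of sorted draws, which is the usual sample-based approach for unimodal distributions.

```python
import math
import random

def hdi(samples, prob=0.94):
    """Narrowest interval containing at least `prob` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(prob * n))   # number of points the interval must cover
    # Slide a window of k consecutive sorted points and keep the narrowest one
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]

def coverage(y_true, draws_per_point, prob=0.94):
    """Fraction of observed values that fall inside their own interval."""
    hits = 0
    for y, draws in zip(y_true, draws_per_point):
        lo, hi = hdi(draws, prob)
        hits += lo <= y <= hi
    return hits / len(y_true)

rng = random.Random(1)
# Synthetic stand-ins: 200 "test campaigns", each with 400 predictive draws
centres = [rng.uniform(1000.0, 5000.0) for _ in range(200)]
y_true = [rng.gauss(c, 300.0) for c in centres]
draws_per_point = [[rng.gauss(c, 300.0) for _ in range(400)] for c in centres]

cov = coverage(y_true, draws_per_point)
print(f"Empirical coverage of 94% intervals: {cov:.2f}")
```

Coverage far below 0.94 would suggest the intervals are too narrow, and coverage near 1.0 that they are wider than needed. Applying this to the real model requires predictive draws for the test rows rather than the training rows.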

Visualise Predictions & Credible Intervals

By sweeping a single predictor (e.g., impressions) and holding others fixed, we plot the posterior mean cost curve and its credible band—illuminating both expected cost sensitivity and the uncertainty around that relation.

import numpy as np
import matplotlib.pyplot as plt

# Sweep one key predictor (e.g., impressions) if it's the first column
grid_vals = np.linspace(X_train_s[:,0].min(), X_train_s[:,0].max(), 100)
grid = np.median(X_train_s, axis=0)[None,:].repeat(100, axis=0)
grid[:,0] = grid_vals

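# The model was fit on X_train_s, so sample_posterior_predictive would return
# predictions for the training rows, not for the grid. Predict on the grid
# directly from the posterior draws instead.
α_s = trace.posterior["α"].values.reshape(-1)                  # (n_samples,)
β_s = trace.posterior["β"].values.reshape(-1, grid.shape[1])   # (n_samples, n_features)
σ_s = trace.posterior["σ"].values.reshape(-1)                  # (n_samples,)

rng = np.random.default_rng(42)
mu_grid = α_s[:, None] + β_s @ grid.T          # expected cost, (n_samples, 100)
preds   = rng.normal(mu_grid, σ_s[:, None])    # posterior predictive draws

mean_pred = preds.mean(axis=0)
# Add a leading chain axis so ArviZ reads the array as (chain, draw, grid point)
hpd       = az.hdi(preds[None, ...], hdi_prob=0.94)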

# Back‐transform the first predictor to original scale
orig_vals = scaler.inverse_transform(grid)[:,0]

plt.figure(figsize=(8,5))
plt.plot(orig_vals, mean_pred, label="Posterior mean")
plt.fill_between(orig_vals, hpd[:,0], hpd[:,1], alpha=0.3,
                 label="94% credible interval")
plt.scatter(
    scaler.inverse_transform(X_test_s)[:,0],
    y_test, color="k", alpha=0.5, label="Test data"
)
plt.xlabel(features[0])
plt.ylabel("Campaign Cost (USD)")
plt.title("Bayesian Regression: Cost vs. " + features[0])
plt.legend()
plt.tight_layout()
plt.show()

Summary

This Bayesian Regression workflow for Marketing Campaign Cost Prediction provides:

1. Point forecasts of campaign costs from early performance metrics, scored by MAE on a held‑out test set.

2. Credible intervals quantifying forecast uncertainty—essential for budget risk management.

3. Actionable insights: marketing leaders can allocate budgets, set bid strategies, and plan contingencies with explicit cost‑risk bounds—optimising both spend efficiency and campaign effectiveness.


ProjectGurukul Team
