Digital Ad Cost Prediction using Bayesian Regression in ML


Marketers and media buyers need to forecast the cost-per-click (CPC) of a digital advertising campaign before launching it or adjusting bids, using early campaign indicators such as daily budget, impressions, click-through rate (CTR), quality score, and device mix. CPC often exhibits nonlinear relationships (e.g., higher CTRs can lower average CPC, while limited budgets can push bids up) and is subject to uncertainty from auction dynamics and competitor behaviour. A model that returns only a single point estimate hides this uncertainty, risking overspend or underdelivery. By applying Bayesian Regression, we obtain both:

1. A point forecast of expected CPC.

2. A credible interval quantifying our uncertainty—enabling data‑driven bid strategy and budget allocation.

Libraries Required

import pandas as pd                              # data loading & manipulation  
import numpy as np                               # numerical operations  

import matplotlib.pyplot as plt                  # plotting  
import seaborn as sns                            # visualization  

import pymc3 as pm                               # Bayesian modeling  
import arviz as az                               # posterior analysis  

from sklearn.model_selection import train_test_split  
from sklearn.preprocessing import StandardScaler  
from sklearn.metrics import mean_absolute_error

Dataset

PPC Campaign Performance Data

Step-by-Step Code Implementation

Data Loading & Preprocessing

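We load the PPC metrics Cost (daily spend), Impressions, and CTR, and derive a Quality_Score proxy if the column is missing. Because the proxy is simply CTR × 100, it is perfectly collinear with CTR, so the two coefficients cannot be interpreted separately when the proxy is used.
StandardScaler centres each feature at zero and scales it to unit variance so that the Normal(0, 1) priors on β apply uniformly.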

import pandas as pd

# Load PPC campaign data
df = pd.read_csv("data/ppc-campaign-performance-data.csv")

# Compute quality score proxy if missing (e.g. CTR × 100)
if 'Quality_Score' not in df.columns:
    df['Quality_Score'] = df['CTR'] * 100

# Define features and target
# Features: daily spend (Cost), Impressions, CTR, Quality_Score (device mix is not in this feature list)
features = ['Cost','Impressions','CTR','Quality_Score']
X = df[features].values
y = df['CPC'].values  # cost-per-click in USD

# Train/test split (80% train / 20% test)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Standardize features for stable MCMC sampling
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s  = scaler.transform(X_test)
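To make the scaling step concrete, the sketch below reproduces the arithmetic that StandardScaler performs (subtract the per-column mean, divide by the per-column population standard deviation) on a tiny feature matrix. The numbers and the `X_toy_*` names are invented for illustration and are not taken from the dataset. The test row is transformed with the training statistics, which is why the scaler above is fit on the training split only.

```python
import numpy as np

# Toy feature matrix (made-up values): columns are Cost and CTR
X_toy_train = np.array([[100.0, 0.02],
                        [250.0, 0.05],
                        [400.0, 0.03],
                        [150.0, 0.04]])
X_toy_test = np.array([[300.0, 0.06]])

# Statistics are learned from the training rows only
col_mean = X_toy_train.mean(axis=0)
col_std = X_toy_train.std(axis=0)   # population std (ddof=0), as StandardScaler uses

X_toy_train_s = (X_toy_train - col_mean) / col_std
X_toy_test_s = (X_toy_test - col_mean) / col_std   # reuses the training statistics

print(X_toy_train_s.mean(axis=0))  # approximately 0 for each column
print(X_toy_train_s.std(axis=0))   # approximately 1 for each column
print(X_toy_test_s)                # test row in units of training standard deviations
```

After this transformation a coefficient of 1 means "one training standard deviation of the feature shifts CPC by one dollar", which is what makes a single Normal(0, 1) prior reasonable across features with very different raw units.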

Define & Fit Bayesian Regression Model

Model Priors:

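α ∼ Normal(0, 1): prior on the intercept, i.e. the expected CPC when every standardised feature is at its training mean. Note that the code standardises the features only, not CPC, so this prior suits CPC values within a few dollars of zero.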
β ∼ Normal(0, 1): weakly informative priors on each feature’s standardised effect.
σ ∼ HalfNormal(1): positive residual noise scale.

Likelihood: Observed CPC ∼ Normal(α + β·X_std, σ).

MCMC Sampling: We draw 2,000 posterior samples per chain (after 1,000 tuning steps) with target_accept=0.9, which makes the NUTS sampler take smaller steps and typically reduces divergences.

import pymc3 as pm

with pm.Model() as ppc_model:
    # Priors for intercept α and coefficients β
    α = pm.Normal("α", mu=0, sigma=1)                            # intercept prior
    β = pm.Normal("β", mu=0, sigma=1, shape=X_train_s.shape[1]) # slopes prior
    σ = pm.HalfNormal("σ", sigma=1)                              # residual noise

    # Linear predictor (standardized features)
    μ = α + pm.math.dot(X_train_s, β)

    # Likelihood: observed CPC
    Y_obs = pm.Normal("Y_obs", mu=μ, sigma=σ, observed=y_train)

    # Sample posterior
    trace = pm.sample(
        draws=2000,           # number of posterior draws
        tune=1000,            # tuning (warm-up) steps per chain, discarded
        target_accept=0.9,    # increase for stable sampling
        return_inferencedata=True
    )
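Before relying on the priors listed above, it can help to simulate from them and look at the CPC values they imply. The following is a minimal prior predictive sketch in plain NumPy, not part of the original workflow; the feature vector `x_std` is a made-up standardised campaign-day, and the sample size is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sims, n_features = 5000, 4

# Draw parameters from the priors stated in the model section
alpha = rng.normal(0.0, 1.0, size=n_sims)                # α ~ Normal(0, 1)
beta = rng.normal(0.0, 1.0, size=(n_sims, n_features))   # β ~ Normal(0, 1)
sigma = np.abs(rng.normal(0.0, 1.0, size=n_sims))        # σ ~ HalfNormal(1)

# One hypothetical campaign-day with standardised features (values are made up)
x_std = np.array([0.5, -0.3, 1.0, 1.0])

# Simulate CPC from the likelihood Normal(α + β·x, σ)
mu = alpha + beta @ x_std
cpc_sim = rng.normal(mu, sigma)

lo, hi = np.percentile(cpc_sim, [3, 97])
frac_negative = (cpc_sim < 0).mean()
print(f"94% prior predictive interval: [{lo:.2f}, {hi:.2f}] USD")
print(f"Share of simulated CPCs below zero: {frac_negative:.0%}")
```

A sizeable share of the simulated values is negative, because a Normal likelihood centred near zero does not know that CPC is a positive quantity. The data usually pulls the posterior away from that region, but it is worth keeping in mind when reading the lower end of the credible band.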

Posterior Analysis & Point Predictions

Posterior Predictive: Sampling Y_obs yields full predictive distributions; we extract the posterior mean forecast and 94% Highest Posterior Density intervals for new CTR values.
Evaluation: Mean Absolute Error (MAE) on the held‑out test set quantifies the average CPC-forecasting error in USD.

import arviz as az
from sklearn.metrics import mean_absolute_error

# Summarize posterior distributions
az.summary(trace, round_to=2)

# Posterior predictive sampling
with ppc_model:
    ppc = pm.sample_posterior_predictive(trace, var_names=["Y_obs"])

# Extract posterior means of α and β
α_post = trace.posterior["α"].mean().item()
β_post = trace.posterior["β"].mean(dim=["chain","draw"]).values

# Compute point predictions on the test set
y_pred = α_post + X_test_s.dot(β_post)

# Evaluate Mean Absolute Error
mae = mean_absolute_error(y_test, y_pred)
print(f"Test MAE: ${mae:.2f} per click")
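MAE only scores the point forecast. A complementary check is whether the predictive intervals cover roughly the stated share of held-out observations. The sketch below shows one way to compute that with NumPy. The function name `predictive_interval` is hypothetical, the interval is equal-tailed rather than an HDI, and the posterior draws and test data are synthetic stand-ins, not results from the PPC dataset.

```python
import numpy as np

def predictive_interval(alpha_s, beta_s, sigma_s, X, prob=0.94, seed=0):
    """Equal-tailed predictive interval per row of X from posterior draws.

    alpha_s: (S,), beta_s: (S, P), sigma_s: (S,), X: (N, P).
    """
    rng = np.random.default_rng(seed)
    mu = alpha_s[:, None] + beta_s @ X.T          # (S, N) linear predictor per draw
    y_sim = rng.normal(mu, sigma_s[:, None])      # add observation noise
    tail = (1.0 - prob) / 2.0 * 100.0
    lower, upper = np.percentile(y_sim, [tail, 100.0 - tail], axis=0)
    return lower, upper

# Synthetic stand-ins for posterior draws and a test set (not real results)
rng = np.random.default_rng(1)
S, N, P = 2000, 500, 4
true_beta = np.array([0.2, -0.1, -0.3, 0.05])
alpha_s = rng.normal(1.0, 0.01, size=S)
beta_s = true_beta + rng.normal(0.0, 0.01, size=(S, P))
sigma_s = np.abs(rng.normal(0.25, 0.005, size=S))
X_new = rng.normal(size=(N, P))
y_new = 1.0 + X_new @ true_beta + rng.normal(0.0, 0.25, size=N)

lower, upper = predictive_interval(alpha_s, beta_s, sigma_s, X_new)
coverage = np.mean((y_new >= lower) & (y_new <= upper))
print(f"Empirical coverage of the 94% interval: {coverage:.1%}")
```

With the fitted model, the three arrays would come from the posterior draws of α, β, and σ in `trace.posterior`, flattened over chains and draws, and `X` would be `X_test_s`. Coverage far below 94% would suggest the intervals are too narrow to be trusted for budgeting.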

Visualise Predictions & Credible Intervals

By sweeping CTR (a key bid driver) and holding the other standardised features at their training medians, we plot both the posterior mean CPC curve and its credible band. The plot shows how predicted CPC changes with CTR and how confident we are in those estimates.

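import numpy as np
import matplotlib.pyplot as plt

# Sweep CTR (column 2) while holding other features at their training median
ctr_grid = np.linspace(X_train_s[:,2].min(), X_train_s[:,2].max(), 100)
grid = np.median(X_train_s, axis=0)[None,:].repeat(100, axis=0)
grid[:,2] = ctr_grid

# Flatten chains and draws, then keep a random subset of 1,000 posterior draws
post = trace.posterior.stack(sample=("chain", "draw"))
rng = np.random.default_rng(42)
idx = rng.choice(post.sizes["sample"], size=1000, replace=False)
α_s = post["α"].values[idx]        # shape (1000,)
β_s = post["β"].values[:, idx]     # shape (n_features, 1000)
σ_s = post["σ"].values[idx]        # shape (1000,)

# Predictive draws on the grid. Sampling Y_obs inside ppc_model would return
# predictions for the training rows, not for the 100 grid rows, so we compute
# Normal(α + grid·β, σ) directly from the posterior draws.
μ_grid = α_s[:, None] + (grid @ β_s).T     # shape (1000, 100)
preds     = rng.normal(μ_grid, σ_s[:, None])
mean_pred = preds.mean(axis=0)
hpd       = az.hdi(preds, hdi_prob=0.94)   # shape (100, 2): lower, upper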

# Back‑transform CTR to original scale
ctr_orig = scaler.inverse_transform(grid)[:,2]

plt.figure(figsize=(8,5))
plt.plot(ctr_orig, mean_pred, label="Posterior mean")
plt.fill_between(ctr_orig, hpd[:,0], hpd[:,1], alpha=0.3,
                 label="94% credible interval")
plt.scatter(
    scaler.inverse_transform(X_test_s)[:,2], y_test,
    color="k", alpha=0.5, label="Test data"
)
plt.xlabel("Click‑Through Rate (CTR)")
plt.ylabel("Cost‑Per‑Click (USD)")
plt.title("Bayesian Regression: CPC vs. CTR")
plt.legend()
plt.tight_layout()
plt.show()
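The band in the plot is a highest density interval (HDI): the narrowest interval that contains 94% of the predictive draws. As a rough sketch of the idea (a simplified stand-in for what ArviZ computes, assuming a single-peaked distribution), the hypothetical helper `hdi_1d` below finds that interval for a one-dimensional sample and compares it with an equal-tailed interval on right-skewed toy data.

```python
import numpy as np

def hdi_1d(samples, prob=0.94):
    """Narrowest interval containing `prob` of the samples (assumes one mode)."""
    x = np.sort(np.asarray(samples))
    n = x.size
    k = int(np.floor(prob * n))     # index distance spanned by each candidate window
    widths = x[k:] - x[: n - k]     # width of every candidate window
    i = int(np.argmin(widths))      # the narrowest window wins
    return x[i], x[i + k]

rng = np.random.default_rng(0)
skewed = rng.lognormal(mean=0.0, sigma=0.5, size=20000)   # right-skewed toy data

hdi_lo, hdi_hi = hdi_1d(skewed)
eq_lo, eq_hi = np.percentile(skewed, [3, 97])
print(f"HDI:          [{hdi_lo:.2f}, {hdi_hi:.2f}]")
print(f"Equal-tailed: [{eq_lo:.2f}, {eq_hi:.2f}]")
```

For a symmetric distribution the two intervals nearly coincide. For a skewed one the HDI shifts towards the mode and is narrower, which is why the two should not be used interchangeably when reporting bounds.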

Summary

This Bayesian Regression workflow for Digital Ad Cost Prediction provides:

1. Point estimates of cost‑per‑click from early campaign metrics.

2. Credible intervals quantifying uncertainty from auction dynamics and campaign variability.

3. Actionable insights: media‑buyers can set bids and budgets with explicit confidence bounds, optimise campaign parameters under uncertainty, and negotiate spend with greater risk awareness.
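As one illustration of the third point, predictive draws can be turned directly into bid and budget statements. The sketch below is hypothetical: `cpc_draws` stands in for posterior predictive CPC draws for a single planned campaign, and the ceiling and budget figures are made up.

```python
import numpy as np

# Hypothetical posterior predictive CPC draws for one planned campaign (made-up numbers)
rng = np.random.default_rng(7)
cpc_draws = rng.normal(loc=1.20, scale=0.15, size=4000)

max_cpc = 1.40          # hypothetical ceiling the buyer is willing to pay per click
daily_budget = 300.0    # hypothetical daily budget in USD

# Probability that CPC exceeds the ceiling, estimated from the draws
p_over = np.mean(cpc_draws > max_cpc)

# Clicks the budget would buy under each draw (non-positive draws are ignored)
valid = cpc_draws[cpc_draws > 0]
clicks = daily_budget / valid
clicks_lo, clicks_hi = np.percentile(clicks, [3, 97])

print(f"P(CPC > ${max_cpc:.2f}) = {p_over:.1%}")
print(f"94% interval for daily clicks: {clicks_lo:.0f} to {clicks_hi:.0f}")
```

A buyer could, for example, decide in advance to proceed only if the probability of exceeding the ceiling stays below a chosen threshold.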
