Digital Ad Cost Prediction using Bayesian Regression in ML
Marketers and media buyers need to forecast the cost-per-click (CPC) of a digital advertising campaign—before launching or adjusting bids—using early campaign indicators such as daily budget, impressions, click-through rate (CTR), quality score, and device mix. CPC often exhibits nonlinear relationships (e.g., higher CTRs can lower average CPC, while limited budgets can push bids up) and is subject to uncertainty from auction dynamics and competitor behaviour. A single point‐estimate model hides this uncertainty, risking overspend or underdelivery. By applying Bayesian Regression, we obtain both:
1. A point forecast of expected CPC.
2. A credible interval quantifying our uncertainty—enabling data‑driven bid strategy and budget allocation.
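To make these two outputs concrete, here is a minimal sketch, assuming we already hold posterior predictive draws of CPC for one campaign. The draws are simulated from a gamma distribution purely for illustration, and `cpc_draws` is a hypothetical name. An equal-tailed percentile interval is used for simplicity; the code later in this article uses ArviZ's highest-density interval instead.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical posterior predictive draws of CPC (USD) for one campaign.
# Simulated here; in practice they would come from the fitted model.
cpc_draws = rng.gamma(shape=9.0, scale=0.15, size=5000)

# 1. Point forecast: the posterior predictive mean
point_forecast = cpc_draws.mean()

# 2. 94% equal-tailed credible interval from the 3rd and 97th percentiles
lower, upper = np.percentile(cpc_draws, [3, 97])

print(f"Expected CPC: ${point_forecast:.2f}")
print(f"94% credible interval: ${lower:.2f} to ${upper:.2f}")
```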
Libraries Required
import pandas as pd                  # data loading & manipulation
import numpy as np                   # numerical operations
import matplotlib.pyplot as plt      # plotting
import seaborn as sns                # visualization
import pymc3 as pm                   # Bayesian modeling
import arviz as az                   # posterior analysis
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_absolute_error
Dataset
Step-by-Step Code Implementation
Data Loading & Preprocessing
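- We load PPC metrics: Cost (daily spend), Impressions, CTR, and derive a Quality_Score proxy if missing. Note that a proxy computed as CTR × 100 is perfectly collinear with CTR, so the model cannot separate the two coefficients; use a real quality score where one is available.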
- StandardScaler zero‑means and unit‑scales features so that Normal(0,1) priors on β apply uniformly.
import pandas as pd
# Load PPC campaign data
df = pd.read_csv("data/ppc-campaign-performance-data.csv")
# Compute quality score proxy if missing (e.g. CTR × 100)
if 'Quality_Score' not in df.columns:
    df['Quality_Score'] = df['CTR'] * 100
# Define features and target
# Features: daily budget (Cost), Impressions, CTR, Quality_Score, Device mix if present
features = ['Cost','Impressions','CTR','Quality_Score']
X = df[features].values
y = df['CPC'].values # cost-per-click in USD
# Train/test split (80% train / 20% test)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
# Standardize features for stable MCMC sampling
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)
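As a rough illustration of what standardisation does and how a standardised coefficient maps back to original units, the sketch below repeats the StandardScaler arithmetic in plain NumPy. The feature values and the coefficient `beta_ctr_std` are made up for the example and are not taken from the dataset or from a fitted model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic training features on very different scales (illustrative only):
# columns stand in for Cost, Impressions, CTR
X_train = np.column_stack([
    rng.uniform(10, 500, size=200),       # Cost in USD
    rng.uniform(1e3, 1e5, size=200),      # Impressions
    rng.uniform(0.005, 0.08, size=200),   # CTR as a fraction
])

# Same arithmetic StandardScaler applies: subtract the mean,
# divide by the population standard deviation (ddof=0)
mean = X_train.mean(axis=0)
scale = X_train.std(axis=0)
X_train_s = (X_train - mean) / scale

# Hypothetical standardised CTR coefficient: CPC change per 1 std of CTR
beta_ctr_std = -0.20
# Convert to original units: CPC change per unit of raw CTR
beta_ctr_orig = beta_ctr_std / scale[2]

print(X_train_s.mean(axis=0).round(6), X_train_s.std(axis=0).round(6))
print(f"CPC change per +0.01 CTR: {beta_ctr_orig * 0.01:.3f} USD")
```

Dividing a standardised coefficient by the feature's standard deviation is what lets a prior such as Normal(0, 1) mean the same thing for every feature, whatever its raw units.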
Define & Fit Bayesian Regression Model
Model Priors:
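- α ∼ Normal(0, 1): intercept prior. The code standardises only the features, not CPC, so this prior assumes the average CPC lies within a few dollars of zero; widen it or standardise y if CPC is typically larger.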
- β ∼ Normal(0, 1): weakly informative priors on each feature’s standardised effect.
- σ ∼ HalfNormal(1): positive residual noise scale.
Likelihood: Observed CPC ∼ Normal(α + β·X_std, σ).
MCMC Sampling: We draw 2,000 posterior samples per chain (after 1,000 tuning steps) with target_accept=0.9, which makes the sampler take smaller steps and reduces divergences at some cost in speed.
import pymc3 as pm
with pm.Model() as ppc_model:
    # Priors for intercept α and coefficients β
    α = pm.Normal("α", mu=0, sigma=1)                              # intercept prior
    β = pm.Normal("β", mu=0, sigma=1, shape=X_train_s.shape[1])    # slopes prior
    σ = pm.HalfNormal("σ", sigma=1)                                # residual noise

    # Linear predictor (standardized features)
    μ = α + pm.math.dot(X_train_s, β)

    # Likelihood: observed CPC
    Y_obs = pm.Normal("Y_obs", mu=μ, sigma=σ, observed=y_train)

    # Sample posterior
    trace = pm.sample(
        draws=2000,          # number of posterior draws
        tune=1000,           # burn‑in samples
        target_accept=0.9,   # increase for stable sampling
        return_inferencedata=True
    )
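Before trusting the posterior, it can help to check what CPC values these priors imply on their own. The following is a rough prior predictive simulation in plain NumPy, assuming four standardised features; it does not use the PyMC3 model object, and the feature matrix `X_s` is a random stand-in, not the real data.

```python
import numpy as np

rng = np.random.default_rng(2)
n_sims, n_obs, n_features = 1000, 50, 4

# Stand-in for standardised features (illustrative, not the real data)
X_s = rng.standard_normal((n_obs, n_features))

# Draw parameters from the priors used in the model
alpha = rng.normal(0.0, 1.0, size=n_sims)                 # α ~ Normal(0, 1)
beta = rng.normal(0.0, 1.0, size=(n_sims, n_features))    # β ~ Normal(0, 1)
sigma = np.abs(rng.normal(0.0, 1.0, size=n_sims))         # σ ~ HalfNormal(1)

# Simulate CPC from the Normal likelihood for each prior draw
mu = alpha[:, None] + beta @ X_s.T                        # shape (n_sims, n_obs)
cpc_sim = rng.normal(mu, sigma[:, None])

# Share of simulated CPC values that are negative, which real CPC cannot be
neg_share = (cpc_sim < 0).mean()
print(f"Simulated CPC range: {cpc_sim.min():.2f} to {cpc_sim.max():.2f}")
print(f"Share of negative CPC under the prior: {neg_share:.0%}")
```

If a large share of simulated values is negative, that suggests the intercept prior is centred too low for CPC in USD, or that a likelihood with positive support may suit the target better.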
Posterior Analysis & Point Predictions
- Posterior Predictive: Sampling Y_obs yields full predictive distributions; we extract the posterior mean forecast and 94% Highest Posterior Density intervals for new CTR values.
- Evaluation: Mean Absolute Error (MAE) on the held‑out test set quantifies the average CPC-forecasting error in USD.
import arviz as az
from sklearn.metrics import mean_absolute_error
# Summarize posterior distributions
print(az.summary(trace, round_to=2))  # includes r_hat and effective sample size
# Posterior predictive sampling
with ppc_model:
    ppc = pm.sample_posterior_predictive(trace, var_names=["Y_obs"])
# Extract posterior means of α and β
α_post = trace.posterior["α"].mean().item()
β_post = trace.posterior["β"].mean(dim=["chain","draw"]).values
# Compute point predictions on the test set
y_pred = α_post + X_test_s.dot(β_post)
# Evaluate Mean Absolute Error
mae = mean_absolute_error(y_test, y_pred)
print(f"Test MAE: ${mae:.2f} per click")
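MAE only scores the point forecast. A complementary check is whether the 94% intervals actually contain about 94% of held-out CPC values. The sketch below shows one way to compute that in NumPy, assuming posterior draws of α, β and σ are available as arrays; here they are simulated, and all names and numbers are illustrative stand-ins. It uses equal-tailed percentile intervals for simplicity.

```python
import numpy as np

rng = np.random.default_rng(3)
n_draws, n_test, n_features = 2000, 40, 4
true_beta = np.array([0.1, -0.05, -0.3, 0.02])

# Illustrative stand-ins for posterior draws and standardised test data
alpha_s = rng.normal(1.2, 0.05, size=n_draws)
beta_s = rng.normal(true_beta, 0.03, size=(n_draws, n_features))
sigma_s = np.abs(rng.normal(0.25, 0.02, size=n_draws))
X_test_s = rng.standard_normal((n_test, n_features))
y_test = 1.2 + X_test_s @ true_beta + rng.normal(0, 0.25, n_test)

# Posterior predictive draws for every test row: mean plus observation noise
mu = alpha_s[:, None] + beta_s @ X_test_s.T          # shape (n_draws, n_test)
y_rep = rng.normal(mu, sigma_s[:, None])

# Point forecast and 94% equal-tailed interval per test row
y_pred = y_rep.mean(axis=0)
lower, upper = np.percentile(y_rep, [3, 97], axis=0)

mae = np.abs(y_test - y_pred).mean()
coverage = ((y_test >= lower) & (y_test <= upper)).mean()
print(f"MAE: ${mae:.2f}, 94% interval coverage: {coverage:.0%}")
```

Coverage far below 94% would indicate overconfident intervals; coverage near 100% would indicate intervals that are wider than they need to be.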
Visualise Predictions & Credible Intervals
By sweeping CTR (a key bid driver) and holding other features at their median, we plot both the posterior mean CPC curve and its credible band—revealing how higher CTRs typically reduce CPC and how confident we are in those estimates.
import numpy as np
import matplotlib.pyplot as plt
# Sweep CTR while holding other features at median
ctr_grid = np.linspace(X_train_s[:,2].min(), X_train_s[:,2].max(), 100)
grid = np.median(X_train_s, axis=0)[None,:].repeat(100, axis=0)
grid[:,2] = ctr_grid
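# The model's Y_obs is tied to X_train_s, so sample_posterior_predictive would
# return training-row predictions; predict on the grid from posterior draws instead
rng = np.random.default_rng(42)
post = trace.posterior
α_s = post["α"].values.reshape(-1)                     # (n_samples,)
β_s = post["β"].values.reshape(-1, grid.shape[1])      # (n_samples, n_features)
σ_s = post["σ"].values.reshape(-1)                     # (n_samples,)
idx = rng.choice(α_s.size, size=1000, replace=False)   # thin to 1,000 draws
μ_grid = α_s[idx, None] + β_s[idx] @ grid.T            # (1000, 100)
preds = rng.normal(μ_grid, σ_s[idx, None])             # add observation noise
mean_pred = preds.mean(axis=0)
hpd = az.hdi(preds[None, ...], hdi_prob=0.94)          # explicit (chain, draw, point) layout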
# Back‑transform CTR to original scale
ctr_orig = scaler.inverse_transform(grid)[:,2]
plt.figure(figsize=(8,5))
plt.plot(ctr_orig, mean_pred, label="Posterior mean")
plt.fill_between(ctr_orig, hpd[:,0], hpd[:,1], alpha=0.3,
                 label="94% credible interval")
plt.scatter(
    scaler.inverse_transform(X_test_s)[:,2], y_test,
    color="k", alpha=0.5, label="Test data"
)
plt.xlabel("Click‑Through Rate (CTR)")
plt.ylabel("Cost‑Per‑Click (USD)")
plt.title("Bayesian Regression: CPC vs. CTR")
plt.legend()
plt.tight_layout()
plt.show()
Summary
This Bayesian Regression workflow for Digital Ad Cost Prediction provides:
1. Point estimates of cost‑per‑click from early campaign metrics.
2. Credible intervals quantifying uncertainty from auction dynamics and campaign variability.
3. Actionable insights: media‑buyers can set bids and budgets with explicit confidence bounds, optimise campaign parameters under uncertainty, and negotiate spend with greater risk awareness.
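As one illustration of the third point, the sketch below turns predictive CPC draws into two quantities a media buyer might act on: the probability that CPC exceeds a ceiling, and the range of clicks a budget is likely to buy. The draws are simulated, and `max_cpc` and `daily_budget` are hypothetical values chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical posterior predictive CPC draws (USD) for a planned campaign
cpc_draws = rng.normal(1.40, 0.30, size=4000)
cpc_draws = cpc_draws[cpc_draws > 0]      # drop impossible non-positive draws

max_cpc = 1.80        # hypothetical ceiling the buyer is willing to pay
daily_budget = 200.0  # hypothetical daily budget in USD

# Probability that the realised CPC exceeds the ceiling
p_over = (cpc_draws > max_cpc).mean()

# Distribution of clicks the budget would buy, with a 94% interval
clicks = daily_budget / cpc_draws
clicks_lo, clicks_hi = np.percentile(clicks, [3, 97])

print(f"P(CPC > ${max_cpc:.2f}) = {p_over:.1%}")
print(f"Expected clicks: {clicks.mean():.0f} "
      f"(94% interval {clicks_lo:.0f} to {clicks_hi:.0f})")
```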