Consumer Purchase Cost Prediction using Bayesian Regression in ML


Retail analysts and pricing teams often need to predict a consumer's purchase amount before the transaction completes, using early inputs such as customer demographics (age, gender), historical spending, promotional status, and basket size. Purchase amounts can show nonlinear effects (e.g., diminishing returns on discount depth, threshold effects of basket variety) and vary considerably from one individual to the next. A classic point-estimate regression hides this uncertainty, which risks mis-targeted promotions or inventory misallocation. Bayesian regression returns both a point estimate of purchase cost and credible intervals that quantify the uncertainty, which supports risk-aware pricing and personalised offer strategies. The model in this tutorial is linear, so it addresses the uncertainty but not the nonlinear effects.

Libraries Required

import pandas as pd                              # data loading & manipulation  
import numpy as np                               # numerical operations  

import matplotlib.pyplot as plt                  # plotting  
import seaborn as sns                            # enhanced visualization  

import pymc3 as pm                               # Bayesian modeling  
import arviz as az                               # posterior analysis  

from sklearn.model_selection import train_test_split  
from sklearn.preprocessing import StandardScaler  
from sklearn.metrics import mean_absolute_error 

Dataset

Customer Purchase Data

Step-by-Step Code Implementation

Import Libraries & Load Data

import pandas as pd

# Load dataset
df = pd.read_csv("data/customer-purchase-data/CustomerPurchaseData.csv")

# Preview relevant columns
df.head()[[
    'Age','Gender','Annual_Income','Num_Purchases','Purchase_Amount'
]]

Preprocessing & Train/Test Split

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Encode gender
df['Gender_Code'] = df['Gender'].map({'Male':0,'Female':1})

# Define features and target
X = df[['Age','Gender_Code','Annual_Income','Num_Purchases']].values
y = df['Purchase_Amount'].values  # in USD

# Split (80% train / 20% test)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Standardize numeric features for MCMC stability
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s  = scaler.transform(X_test)
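
Because the model is fitted on standardized features, any coefficient it learns is expressed per standard deviation of a feature rather than per year of age or per dollar of income. The sketch below shows one way to map a standardized intercept and slope vector back to original units. It assumes the plain (x - mean) / std transform that StandardScaler applies; the helper name unstandardize_coefs, the small feature matrix, and the coefficient values are all made up for illustration.

```python
import numpy as np

def unstandardize_coefs(alpha_s, beta_s, mean, scale):
    """Map coefficients of a model on (x - mean) / scale back to raw-x units.

    Hypothetical helper for illustration; not part of scikit-learn or PyMC3.
    """
    beta_s = np.asarray(beta_s, dtype=float)
    beta_raw = beta_s / scale                            # slope per raw unit
    alpha_raw = alpha_s - np.sum(beta_s * mean / scale)  # shift absorbed by intercept
    return alpha_raw, beta_raw

# Made-up feature matrix standing in for two columns (Age, Annual_Income).
X_demo = np.array([[25.0, 30000.0], [40.0, 52000.0], [58.0, 81000.0]])
mean, scale = X_demo.mean(axis=0), X_demo.std(axis=0)    # population std, as StandardScaler uses
X_demo_s = (X_demo - mean) / scale

alpha_s, beta_s = 120.0, np.array([8.0, 35.0])           # illustrative values only
alpha_raw, beta_raw = unstandardize_coefs(alpha_s, beta_s, mean, scale)

# Both parameterisations should give identical predictions.
pred_scaled = alpha_s + X_demo_s @ beta_s
pred_raw = alpha_raw + X_demo @ beta_raw
```

With a fitted scaler, scaler.mean_ and scaler.scale_ would play the roles of mean and scale.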

Define & Fit Bayesian Regression Model

Priors:

  • α ∼ Normal(0, 100): broad intercept prior.
  • βᵢ ∼ Normal(0, 50): moderate prior uncertainty for each standardised predictor.
  • σ ∼ HalfNormal(50): residual noise scale.

Model:

  • The linear predictor μ = α + β·X_standardized links demographics and behaviour to the purchase amount.
  • Observations y_train ∼ Normal(μ, σ).

Sampling:

  • Each chain draws 2,000 posterior samples after 1,000 tuning (burn-in) steps. Setting target_accept=0.9 makes the sampler take smaller steps, which tends to reduce divergences at the cost of slower sampling.
  • Posterior predictive sampling yields complete predictive distributions.

import pymc3 as pm

with pm.Model() as purchase_model:
    # Priors
    α = pm.Normal("α", mu=0, sigma=100)
    β = pm.Normal("β", mu=0, sigma=50, shape=X_train_s.shape[1])
    σ = pm.HalfNormal("σ", sigma=50)
    
    # Linear predictor
    μ = α + pm.math.dot(X_train_s, β)
    
    # Likelihood
    Y_obs = pm.Normal("Y_obs", mu=μ, sigma=σ, observed=y_train)
    
    # Sample the posterior
    trace = pm.sample(
        draws=2000, tune=1000,
        target_accept=0.9,
        return_inferencedata=True
    )
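
Before trusting the fit, it can help to see what the priors imply on their own. The following is a minimal NumPy-only sketch of a prior predictive simulation for a single hypothetical customer. It does not use PyMC3; the feature vector x_s is invented and the seed is arbitrary. It draws α, β and σ from the priors listed in this section and then simulates purchase amounts from Normal(μ, σ).

```python
import numpy as np

rng = np.random.default_rng(0)          # arbitrary seed for reproducibility
n_sims = 5000
n_features = 4                          # Age, Gender_Code, Annual_Income, Num_Purchases

# Draws from the priors described in this section.
alpha = rng.normal(0.0, 100.0, size=n_sims)
beta = rng.normal(0.0, 50.0, size=(n_sims, n_features))
sigma = np.abs(rng.normal(0.0, 50.0, size=n_sims))   # HalfNormal(50) = |Normal(0, 50)|

# Hypothetical standardized customer: one SD above average on every feature.
x_s = np.ones(n_features)

mu = alpha + beta @ x_s                 # linear predictor, one value per simulation
y_sim = rng.normal(mu, sigma)           # simulated purchase amounts

low, high = np.percentile(y_sim, [3, 97])
share_negative = np.mean(y_sim < 0)
print(f"94% prior predictive interval: [{low:.0f}, {high:.0f}]")
print(f"Share of simulated amounts below zero: {share_negative:.2f}")
```

Since every prior is centred on zero and the likelihood is symmetric, roughly half of the simulated amounts come out negative, which is impossible for a purchase. Whether that matters after conditioning on data depends on the dataset.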

Posterior Analysis & Point Predictions

Posterior means of α and β give point forecasts; MAE on held‑out data quantifies average error.

import arviz as az
from sklearn.metrics import mean_absolute_error

# Summarize posterior
az.summary(trace, round_to=2)

# Posterior predictive sampling
with purchase_model:
    ppc = pm.sample_posterior_predictive(trace, var_names=["Y_obs"])

# Posterior means
α_post = trace.posterior["α"].mean().item()
β_post = trace.posterior["β"].mean(dim=["chain","draw"]).values

# Point predictions on test set
y_pred = α_post + X_test_s.dot(β_post)

# Evaluate MAE
mae = mean_absolute_error(y_test, y_pred)
print(f"Test MAE: ${mae:.2f}")
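
MAE only scores the point forecast. One way to also check the uncertainty estimates is to measure how often held-out purchase amounts fall inside the predictive interval. The sketch below is self-contained and runs on synthetic stand-ins for the posterior draws and the test set; all arrays and the helper name interval_coverage are hypothetical. With the fitted model, flattened draws of α, β and σ from trace.posterior would take their place. For simplicity it uses equal-tailed percentile intervals rather than the HDI.

```python
import numpy as np

def interval_coverage(alpha_draws, beta_draws, sigma_draws, X, y, prob=0.94, seed=0):
    """Fraction of y inside the central `prob` posterior predictive interval."""
    rng = np.random.default_rng(seed)
    mu = alpha_draws[:, None] + beta_draws @ X.T          # (n_draws, n_obs)
    y_rep = rng.normal(mu, sigma_draws[:, None])          # add observation noise
    tail = 100 * (1 - prob) / 2
    lower, upper = np.percentile(y_rep, [tail, 100 - tail], axis=0)
    return np.mean((y >= lower) & (y <= upper))

# Synthetic stand-ins: 2,000 draws tightly centred on made-up "true" parameters.
rng = np.random.default_rng(1)
true_alpha, true_beta, true_sigma = 200.0, np.array([10.0, -5.0, 30.0, 15.0]), 40.0
alpha_draws = rng.normal(true_alpha, 1.0, size=2000)
beta_draws = rng.normal(true_beta, 1.0, size=(2000, 4))
sigma_draws = np.abs(rng.normal(true_sigma, 1.0, size=2000))

X_demo = rng.normal(size=(500, 4))                        # standardized features
y_demo = true_alpha + X_demo @ true_beta + rng.normal(0, true_sigma, size=500)

coverage = interval_coverage(alpha_draws, beta_draws, sigma_draws, X_demo, y_demo)
print(f"Empirical coverage of the 94% interval: {coverage:.2f}")
```

Coverage well below the nominal level would suggest the intervals are too narrow; coverage near 100% would suggest they are too wide.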

Visualise Predictions & Credible Intervals

Sweeping one feature (number of purchases) while holding the others fixed at their medians, we plot the posterior predictive mean and a 94% highest-density interval (HDI), showing both central tendency and uncertainty. Because the model is linear, the mean is a straight line.

# Vary Num_Purchases; fix others at median
import numpy as np
import matplotlib.pyplot as plt

num_grid = np.linspace(X_train_s[:,3].min(), X_train_s[:,3].max(), 100)
grid = np.tile(np.median(X_train_s, axis=0), (100,1))
grid[:,3] = num_grid

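# The model stores X_train_s as a constant, so pm.set_data cannot swap in
# the grid. Build the posterior predictive draws from the samples directly.
post    = trace.posterior
α_draws = post["α"].values.reshape(-1)                   # (n_samples,)
β_draws = post["β"].values.reshape(-1, grid.shape[1])    # (n_samples, 4)
σ_draws = post["σ"].values.reshape(-1)                   # (n_samples,)

rng     = np.random.default_rng(42)
μ_grid  = α_draws[:, None] + β_draws @ grid.T            # (n_samples, 100)
preds   = rng.normal(μ_grid, σ_draws[:, None])           # add observation noise

mean_pred = preds.mean(axis=0)
hpd       = az.hdi(preds[np.newaxis], hdi_prob=0.94)     # (100, 2): lower, upper

# Convert Num_Purchases back to original units
num_orig = scaler.inverse_transform(grid)[:, 3]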

plt.figure(figsize=(8,5))
plt.plot(num_orig, mean_pred, label="Posterior mean")
plt.fill_between(num_orig, hpd[:,0], hpd[:,1], alpha=0.3,
                 label="94% CI")
plt.scatter(X_test[:, 3], y_test,
            color="k", alpha=0.5, label="Test data")
plt.xlabel("Number of Purchases")
plt.ylabel("Purchase Amount (USD)")
plt.title("Bayesian Regression: Purchase Amount vs. Num_Purchases")
plt.legend()
plt.show()
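
Predictive draws can also answer threshold questions directly, such as how likely a customer is to spend more than a given amount. The following sketch assumes an array of posterior predictive draws shaped (n_draws, n_customers). The draws here are synthetic, and the 250 USD threshold and the helper name prob_exceeds are invented for illustration.

```python
import numpy as np

def prob_exceeds(pred_draws, threshold):
    """Per-customer probability that the purchase amount exceeds `threshold`."""
    return np.mean(pred_draws > threshold, axis=0)

rng = np.random.default_rng(7)
# Synthetic draws for three hypothetical customers with different expected spend.
centres = np.array([150.0, 250.0, 400.0])
pred_draws = rng.normal(centres, 40.0, size=(4000, 3))

p_over = prob_exceeds(pred_draws, threshold=250.0)
for centre, p in zip(centres, p_over):
    print(f"expected spend ~{centre:.0f} USD: P(spend > 250) = {p:.2f}")
```

A probability like this uses the whole predictive distribution, so two customers with the same expected spend but different uncertainty can receive different answers.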

Summary

This Bayesian Regression framework for consumer purchase‐amount forecasting provides:

1. Point estimates of expected spend from early customer indicators.

2. Credible intervals quantifying uncertainty from behavioural variability.

3. Actionable insights: marketing and sales teams can use both expected purchase amounts and uncertainty bounds to tailor promotions, adjust loyalty rewards, and optimise inventory under uncertainty.
