Retail Sales Cost Prediction using Bayesian Regression in ML
Retail financial planners need to predict the total cost of goods sold (COGS) for an upcoming sales period before that period begins, using early indicators such as historical sales volume, average unit price, promotional discount rates, and seasonality flags. COGS often exhibits nonlinear scale effects (bulk purchasing discounts) and uncertainty from supply chain variability (price fluctuations, freight costs). A traditional point-estimate regression yields a single forecast and hides that uncertainty. Bayesian regression instead yields both point estimates and credible intervals for COGS, so procurement teams can budget with risk awareness and negotiate supplier contracts more effectively.
Libraries Required
import pandas as pd                  # data loading & manipulation
import numpy as np                   # numerical operations
import matplotlib.pyplot as plt      # plotting
import seaborn as sns                # enhanced visualization
import pymc3 as pm                   # Bayesian modeling
import arviz as az                   # posterior analysis
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_absolute_error
Dataset
Step-by-Step Code Implementation
Import Libraries & Load Data
import pandas as pd
# Load the retail transactions
df = pd.read_csv("data/retail-transactions-dataset.csv")
# Keep only fields we need
df = df[['Quantity','UnitPrice','TotalCost','DayOfWeek','PromotionFlag']]
df.head()
Preprocessing & Train/Test Split
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
# Define features and target
# TotalCost is our COGS for that transaction
X = df[['Quantity','UnitPrice','PromotionFlag','DayOfWeek']].copy()
# One-hot encode day of week; dtype=float keeps the whole feature matrix numeric
X = pd.get_dummies(X, columns=['DayOfWeek'], drop_first=True, dtype=float)
y = df['TotalCost'].values
# Split chronologically if timestamp exists; otherwise random
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.2, random_state=42
)
# Standardize numeric features for MCMC stability
num_cols = ['Quantity','UnitPrice','PromotionFlag']
scaler = StandardScaler().fit(X_train[num_cols])
X_train_s = X_train.copy()
X_train_s[num_cols] = scaler.transform(X_train[num_cols])
X_test_s = X_test.copy()
X_test_s[num_cols] = scaler.transform(X_test[num_cols])
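The split above is random because the selected columns carry no timestamp. If the raw data did include one, a time-ordered split would keep later transactions out of the training set. The following is a minimal sketch using NumPy only; the `timestamps` array and the `chronological_split` helper are hypothetical and not part of the original dataset or code.

```python
import numpy as np

def chronological_split(timestamps, test_size=0.2):
    """Return (train_idx, test_idx) with the latest rows held out for testing."""
    timestamps = np.asarray(timestamps)
    order = np.argsort(timestamps, kind="stable")  # oldest first
    n_test = max(1, int(round(len(order) * test_size)))
    return order[:-n_test], order[-n_test:]

# Hypothetical timestamps, deliberately out of order
ts = np.array(
    ["2024-03-01", "2024-01-15", "2024-02-10", "2024-04-20", "2024-01-02"],
    dtype="datetime64[D]",
)
train_idx, test_idx = chronological_split(ts, test_size=0.2)
```

The returned positions could then be used with `X.iloc[train_idx]` and `y[train_idx]` in place of `train_test_split`.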
Define & Fit Bayesian Regression Model
- Priors (α, β, σ): Weakly informative normals anchor intercept and coefficients, and a half‑normal on noise scale reflects moderate uncertainty.
- Linear predictor: μ = α + β·X_standardized models the total cost as a linear function of standardised features.
- Likelihood: The observed transaction cost is normally distributed around μ with standard deviation σ.
- MCMC sampling: We draw 2,000 posterior samples per chain (after 1,000 tuning steps) with target_accept=0.9, which makes the sampler take smaller steps and reduces the risk of divergences.
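The first three bullets describe a generative model, and one way to see what it implies is to simulate from it before fitting anything. The sketch below draws from the priors in plain NumPy, using the same prior scales as the model (20, 10, 10). The feature matrix `X_demo` is random stand-in data, not the retail dataset, and all sizes are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

n_obs, n_features, n_sims = 50, 3, 1000
X_demo = rng.standard_normal((n_obs, n_features))  # stand-in for standardized features

# Priors: α ~ Normal(0, 20), β ~ Normal(0, 10), σ ~ HalfNormal(10)
alpha = rng.normal(0.0, 20.0, size=n_sims)
beta = rng.normal(0.0, 10.0, size=(n_sims, n_features))
sigma = np.abs(rng.normal(0.0, 10.0, size=n_sims))  # |Normal| is half-normal

# Linear predictor μ = α + X·β, one row per simulated parameter set
mu = alpha[:, None] + beta @ X_demo.T  # shape (n_sims, n_obs)

# Likelihood: y ~ Normal(μ, σ)
y_sim = rng.normal(mu, sigma[:, None])
```

If the simulated costs in `y_sim` fall far outside the range of plausible transaction costs, that is a hint the prior scales may need adjusting for the data at hand.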
import pymc3 as pm
with pm.Model() as retail_model:
    # Priors
    α = pm.Normal("α", mu=0, sigma=20)
    β = pm.Normal("β", mu=0, sigma=10, shape=X_train_s.shape[1])
    σ = pm.HalfNormal("σ", sigma=10)
    # Linear predictor
    μ = α + pm.math.dot(X_train_s.values, β)
    # Likelihood
    Y_obs = pm.Normal("Y_obs", mu=μ, sigma=σ, observed=y_train)
    # MCMC sampling
    trace = pm.sample(
        draws=2000, tune=1000,
        target_accept=0.9,
        return_inferencedata=True
    )
Posterior Analysis & Point Predictions
- Posterior predictive: Sampling from the posterior predictive distribution yields both point forecasts and credible intervals for COGS.
- Point predictions: We use posterior means of α and β to compute mean forecasts and evaluate MAE on held‑out transactions.
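Plugging the posterior means of α and β into the linear predictor gives the posterior predictive mean, because the model is linear in its parameters and the noise has zero mean. Writing x* for a standardized test row and D for the training data (notation introduced here for illustration):

```latex
\mathbb{E}[y^{*} \mid x^{*}, \mathcal{D}]
  = \mathbb{E}[\alpha + x^{*\top}\beta + \varepsilon \mid \mathcal{D}]
  = \mathbb{E}[\alpha \mid \mathcal{D}] + x^{*\top}\,\mathbb{E}[\beta \mid \mathcal{D}],
\qquad \varepsilon \mid \sigma \sim \mathcal{N}(0, \sigma^{2}).
```

This shortcut would not carry over to a model with a nonlinear link or a transformed target, where the mean of the predictions differs from the prediction at the mean parameters.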
import arviz as az
# Summarize posterior
az.summary(trace, round_to=2)
# Posterior predictive sampling
with retail_model:
    ppc = pm.sample_posterior_predictive(trace, var_names=["Y_obs"])
# Compute posterior means
α_post = trace.posterior["α"].mean().item()
β_post = trace.posterior["β"].mean(dim=["chain","draw"]).values
# Point predictions on test set
y_pred = α_post + X_test_s.values.dot(β_post)
# Evaluate MAE
from sklearn.metrics import mean_absolute_error
mae = mean_absolute_error(y_test, y_pred)
print(f"Test MAE: ${mae:.2f}")
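MAE only scores the point forecast. Since the motivation for the Bayesian model is the interval, it can also help to check how often held-out values fall inside it. The sketch below is one way to do that, under stated assumptions: `interval_coverage` is a hypothetical helper, the draws are synthetic stand-ins for posterior predictive draws on the test set, and it uses an equal-tailed percentile interval rather than an HDI.

```python
import numpy as np

def interval_coverage(y_true, pred_draws, prob=0.94):
    """Fraction of y_true inside the equal-tailed `prob` interval of pred_draws.

    pred_draws has shape (n_draws, n_obs): one column of predictive draws
    per held-out observation.
    """
    tail = (1.0 - prob) / 2.0 * 100.0
    lower, upper = np.percentile(pred_draws, [tail, 100.0 - tail], axis=0)
    inside = (y_true >= lower) & (y_true <= upper)
    return inside.mean()

# Synthetic stand-in: observations and draws come from the same distribution
rng = np.random.default_rng(1)
true_mean = rng.uniform(10.0, 100.0, size=500)
y_demo = rng.normal(true_mean, 5.0)
draws_demo = rng.normal(true_mean, 5.0, size=(4000, 500))
coverage = interval_coverage(y_demo, draws_demo, prob=0.94)
```

For a well-calibrated model the coverage should sit near the nominal probability; a much lower value suggests the intervals are too narrow.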
Visualise Predictions & Credible Intervals
Plotting cost vs. quantity with a posterior mean line and 94% credible bands illustrates both the expected cost behaviour and the associated uncertainty.
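The 94% band is a highest-density interval (HDI): the narrowest interval that contains 94% of the draws. The sketch below illustrates that idea for a single set of samples in plain NumPy. It is not ArviZ's implementation; `hdi_from_samples` is a hypothetical helper that assumes a unimodal distribution, and the gamma samples are synthetic.

```python
import numpy as np

def hdi_from_samples(samples, prob=0.94):
    """Narrowest interval containing `prob` of a 1-D array of samples."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    k = int(np.ceil(prob * n))  # number of samples the interval must contain
    if k < 2 or k > n:
        raise ValueError("need at least two samples inside the interval")
    # Width of every window of k consecutive sorted samples
    widths = x[k - 1:] - x[: n - k + 1]
    i = int(np.argmin(widths))
    return x[i], x[i + k - 1]

# Synthetic right-skewed samples to contrast HDI with an equal-tailed interval
rng = np.random.default_rng(2)
skewed = rng.gamma(shape=2.0, scale=10.0, size=20000)
hdi_low, hdi_high = hdi_from_samples(skewed, prob=0.94)
eq_low, eq_high = np.percentile(skewed, [3.0, 97.0])
```

For symmetric distributions the two intervals nearly coincide; for skewed ones the HDI is narrower and shifts toward the mode.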
# Example: vary Quantity while holding others at median
qty_grid = np.linspace(X_train_s['Quantity'].min(), X_train_s['Quantity'].max(), 100)
median_vals = X_train_s.median()
grid = pd.DataFrame({
    'Quantity': qty_grid,
    'UnitPrice': median_vals['UnitPrice'],
    'PromotionFlag': median_vals['PromotionFlag'],
    **{col: median_vals[col] for col in X_train_s.columns if col.startswith('DayOfWeek_')}
})
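# Posterior predictive on the grid, computed directly from the posterior draws
# (the model has no pm.Data container named "X", so pm.set_data cannot swap inputs)
post = trace.posterior
α_s = post["α"].values.ravel()                        # (n_samples,)
β_s = post["β"].values.reshape(-1, grid.shape[1])     # (n_samples, n_features)
σ_s = post["σ"].values.ravel()                        # (n_samples,)
μ_grid = α_s[:, None] + β_s @ grid.values.astype(float).T
preds = np.random.default_rng(42).normal(μ_grid, σ_s[:, None])
mean_pred = preds.mean(axis=0)
hpd = az.hdi(preds[np.newaxis], hdi_prob=0.94)        # add a chain axis: (chain, draw, point)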
# Transform Quantity back to original units (the scaler was fit on num_cols only)
qty_orig = scaler.inverse_transform(grid[num_cols])[:, 0]
plt.figure(figsize=(8,5))
plt.plot(qty_orig, mean_pred, label="Posterior mean")
plt.fill_between(qty_orig, hpd[:,0], hpd[:,1], alpha=0.3, label="94% CI")
plt.scatter(X_test['Quantity'], y_test,  # X_test still holds unscaled quantities
            color='k', alpha=0.5, label="Test data")
plt.xlabel("Quantity Sold")
plt.ylabel("Total Cost (USD)")
plt.title("Bayesian Regression: COGS vs. Quantity")
plt.legend()
plt.show()
Summary
This Bayesian Regression approach for retail COGS forecasting provides:
1. Point estimates of per‑transaction cost based on early sales indicators.
2. Credible intervals quantifying uncertainty from supply‑chain and demand variability.
3. Actionable insights: procurement and finance teams gain both expected COGS and its uncertainty bounds—enabling risk‑aware budgeting, dynamic supplier negotiations, and contingency planning.
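The model predicts cost per transaction, while a budget usually needs a total for the period. One way to bridge the two is to sum the predictive draws across transactions within each draw and then summarize the totals. The sketch below assumes a hypothetical helper `total_cost_interval` and uses synthetic draws in place of real posterior predictive output.

```python
import numpy as np

def total_cost_interval(pred_draws, prob=0.94):
    """Summarize total cost across transactions from per-transaction draws.

    pred_draws has shape (n_draws, n_transactions).
    Returns (mean_total, lower, upper) using an equal-tailed interval.
    """
    totals = pred_draws.sum(axis=1)  # one simulated period total per draw
    tail = (1.0 - prob) / 2.0 * 100.0
    lower, upper = np.percentile(totals, [tail, 100.0 - tail])
    return totals.mean(), lower, upper

# Synthetic stand-in for posterior predictive draws of 200 transactions
rng = np.random.default_rng(3)
draws_demo = rng.normal(loc=50.0, scale=8.0, size=(4000, 200))
expected_total, low_total, high_total = total_cost_interval(draws_demo)
```

Summing within each draw matters: adding up the per-transaction interval endpoints would not give the interval of the total, and with real posterior draws each row shares the same sampled α, β, and σ.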