Retail Price Impact Prediction using Bayesian Regression in ML


Retail merchandisers and pricing analysts need to quantify how sensitive unit sales are to changes in retail price before implementing markdowns or list‑price adjustments, using features such as current price per unit, promotion flag, store traffic index, and competitor price. Demand curves often exhibit nonlinear elasticity (e.g., steep drop‑offs beyond certain price thresholds), and forecasts carry uncertainty from unobserved shopper behaviour. By applying Bayesian Regression, we obtain both a point estimate of the price impact coefficient and a credible interval that captures our uncertainty, enabling data‑driven pricing decisions with explicit risk bounds.

Libraries Required

import pandas as pd                               # data loading & handling  
import numpy as np                                # numerical operations  

import matplotlib.pyplot as plt                   # plotting  
import seaborn as sns                             # enhanced visualization  

import pymc3 as pm                                # Bayesian modeling  
import arviz as az                                # posterior analysis  

from sklearn.model_selection import train_test_split  
from sklearn.preprocessing import StandardScaler  
from sklearn.metrics import mean_absolute_error

Step-by-Step Code Implementation

Import Libraries & Load Data

import pandas as pd
import numpy as np   # needed for the log transforms in the next step

# Load supermarket sales data
df = pd.read_csv("data/Supermarket Sales.csv")

# Rename columns to snake_case, then preview the relevant ones
df = df.rename(columns={
    'Unit price': 'unit_price',
    'Quantity':   'quantity_sold',
    'Total':      'total_revenue',
    'Customer type':'cust_type',
    'City':'store'
})
df[['unit_price','quantity_sold','total_revenue','cust_type','store','Date']].head()

Feature Engineering & Train/Test Split

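We model log(quantity) against log(price) so that the log-price coefficient β₀ (β[0] in the code) can be read as price elasticity: the percent change in units sold per one-percent change in price. Because the code takes log(quantity + 1), this reading is approximate when quantities are small.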
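To see why the slope of a log-log fit is an elasticity, here is a minimal sketch that generates demand from a known constant-elasticity curve and recovers the exponent with ordinary least squares. The prices, the scale factor 100, and the elasticity of -1.5 are made-up illustration values, not taken from the supermarket data, and `ols_slope` is a hypothetical helper written only for this example.

```python
import math

def ols_slope(x, y):
    """Least-squares slope of y on x (one predictor plus an intercept)."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    cov = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    var = sum((xi - x_bar) ** 2 for xi in x)
    return cov / var

# Hypothetical constant-elasticity demand: quantity = 100 * price ** (-1.5)
true_elasticity = -1.5
prices = [1.0, 2.0, 4.0, 8.0]
quantities = [100.0 * p ** true_elasticity for p in prices]

# Taking logs turns the power law into a straight line whose slope is the exponent
log_p = [math.log(p) for p in prices]
log_q = [math.log(q) for q in quantities]

slope = ols_slope(log_p, log_q)
print(f"Recovered elasticity: {slope:.3f}")
```

On noise-free data the fitted slope matches the exponent used to generate the quantities. With real sales data the Bayesian model plays the role of `ols_slope` and also reports how uncertain that slope is.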

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Compute log‐sales and log‐price for elasticity
df['log_qty']   = np.log(df['quantity_sold'] + 1)
df['log_price'] = np.log(df['unit_price'])

# Promotion flag proxy: unit_price below the overall median price (not a per-item median)
median_price = df['unit_price'].median()
df['promo'] = (df['unit_price'] < median_price).astype(int)

# Traffic index proxy: day‐of‐week average footfall (here: weekday vs. weekend)
df['weekday'] = pd.to_datetime(df['Date']).dt.weekday
df['weekend_flag'] = (df['weekday'] >= 5).astype(int)

# Select predictors & target
features = ['log_price','promo','weekend_flag']
X = df[features].values
y = df['log_qty'].values  # log‐quantity = intercept + β·log_price + ...

# Train/test split (80/20 random)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Standardize non‐log features for sampling stability
scaler = StandardScaler().fit(X_train[:,1:])
X_train_s = X_train.copy()
X_train_s[:,1:] = scaler.transform(X_train[:,1:])
X_test_s = X_test.copy()
X_test_s[:,1:]  = scaler.transform(X_test[:,1:])
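One consequence of the standardization above: log_price is left unscaled, so β[0] keeps its elasticity reading, while the promo and weekend coefficients are expressed per standard deviation of the flag. A rough sketch of converting such a coefficient back to a per-unit (0 to 1) effect follows. The flag values and the coefficient 0.2 are invented for illustration, and `unstandardize_coef` is a hypothetical helper, not part of scikit-learn.

```python
from statistics import pstdev

def unstandardize_coef(beta_std, raw_values):
    """Convert a per-standard-deviation coefficient into a per-unit one.

    Uses the population standard deviation (ddof=0), which is the
    convention StandardScaler applies when it scales a column.
    """
    scale = pstdev(raw_values)
    return beta_std / scale

# Hypothetical training column for the promo flag and a made-up fitted coefficient
promo_train = [0, 0, 1, 1]
beta_promo_std = 0.2

beta_promo_raw = unstandardize_coef(beta_promo_std, promo_train)
print(f"Effect of promo=1 vs promo=0 on log-quantity: {beta_promo_raw:.2f}")
```

With the fitted scaler from the article, the same division could use `scaler.scale_`, which stores one standard deviation per scaled column.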

Define & Fit Bayesian Regression Model

Priors (α, β, σ): We choose weakly informative Normal(0, 1) priors for α and β, expressing initial uncertainty but centering around zero, and a HalfNormal(1) prior for σ, which must be positive.

Regression model:

μ = α + β₀·log_price + β₁·promo + β₂·weekend_flag.
Observations: log_qty ∼ Normal(μ, σ).

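MCMC sampling: 2,000 posterior draws per chain (after 1,000 tuning steps) with target_accept=0.9. The higher acceptance target makes the sampler take smaller steps, which reduces divergences; convergence should still be checked afterwards (for example, r_hat close to 1 in the posterior summary).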

import pymc3 as pm

with pm.Model() as price_elasticity_model:
    # Priors
    α = pm.Normal("α", mu=0, sigma=1)                         # intercept for log‐sales
    # β[0]: price elasticity; β[1]: promo; β[2]: weekend
    β = pm.Normal("β", mu=0, sigma=1, shape=X_train_s.shape[1])
    σ = pm.HalfNormal("σ", sigma=1)                           # residual noise

    # Linear predictor
    μ = α + pm.math.dot(X_train_s, β)

    # Likelihood
    Y_obs = pm.Normal("Y_obs", mu=μ, sigma=σ, observed=y_train)

    # Sample posterior
    trace = pm.sample(
        draws=2000,
        tune=1000,
        target_accept=0.9,
        return_inferencedata=True
    )
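After sampling, it is worth confirming that the chains agree with each other. The sketch below computes the classic (non-split) Gelman-Rubin statistic in plain Python to show the idea. The chains are toy numbers, and `gelman_rubin` is a hypothetical helper; ArviZ reports a more refined rank-normalized split version in the `r_hat` column of `az.summary`, so values will not match exactly.

```python
from statistics import mean, variance

def gelman_rubin(chains):
    """Classic Gelman-Rubin R-hat for a list of equal-length chains."""
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])   # W: average within-chain variance
    between = n * variance(chain_means)            # B: variance between chain means
    var_hat = (n - 1) / n * within + between / n   # pooled variance estimate
    return (var_hat / within) ** 0.5

# Toy chains: the first pair explores the same region, the second pair does not
agreeing = [[0.0, 1.0, 2.0, 3.0], [3.0, 2.0, 1.0, 0.0]]
disagreeing = [[0.0, 1.0, 2.0, 3.0], [10.0, 11.0, 12.0, 13.0]]

r_ok = gelman_rubin(agreeing)
r_bad = gelman_rubin(disagreeing)
print(f"agreeing chains:    R-hat = {r_ok:.2f}")
print(f"disagreeing chains: R-hat = {r_bad:.2f}")
```

Values well above 1 indicate that chains are sampling different regions, in which case more tuning steps or a reparameterized model would be the usual next things to try.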

Posterior Analysis & Point Estimates

Posterior analysis: We extract the posterior mean and the 94% highest-density interval (HDI, a type of credible interval) for β₀, our primary metric of price sensitivity.
Prediction: Posterior predictive sampling yields in‐sample MAE for log‐quantity; the same pipeline can forecast on held‐out data.

import arviz as az
from sklearn.metrics import mean_absolute_error

# Summarize posterior (includes r_hat and effective sample size)
print(az.summary(trace, round_to=2))

# Extract posterior draws of price elasticity β[0]
# (the last axis of the array indexes the three coefficients)
elasticity_samples = trace.posterior['β'].values[..., 0].flatten()
elasticity_mean = elasticity_samples.mean()
hdi_low, hdi_high = az.hdi(elasticity_samples, hdi_prob=0.94)

print(f"Estimated Price Elasticity (β₀): {elasticity_mean:.2f}")
print(f"94% Credible Interval: [{hdi_low:.2f}, {hdi_high:.2f}]")

# Posterior predictive & MAE
with price_elasticity_model:
    ppc = pm.sample_posterior_predictive(trace, var_names=["Y_obs"])

y_pred = ppc['Y_obs'].mean(axis=0)
print("Train MAE (log‐qty):", mean_absolute_error(y_train, y_pred))
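The MAE above is computed on the training rows. For held-out data, one possible approach is to evaluate the linear predictor α + x·β for every posterior draw and average the results. The sketch below does this in plain Python with a handful of invented draws and rows; in the article's setting the draws would come from `trace.posterior` and the rows from `X_test_s`. The helper names `posterior_mean_prediction` and `mae` are hypothetical.

```python
from statistics import mean

def posterior_mean_prediction(alpha_draws, beta_draws, x_row):
    """Average alpha + x·beta over posterior draws for one feature row."""
    preds = [
        a + sum(b_j * x_j for b_j, x_j in zip(b, x_row))
        for a, b in zip(alpha_draws, beta_draws)
    ]
    return mean(preds)

def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return mean([abs(t - p) for t, p in zip(y_true, y_pred)])

# Invented posterior draws: one intercept and three coefficients per draw
alpha_draws = [1.0, 1.2]
beta_draws = [[-1.0, 0.1, 0.0], [-1.2, 0.1, 0.0]]

# Invented held-out rows (log_price, scaled promo, scaled weekend) and targets
X_holdout = [[0.5, 1.0, 0.0], [1.0, 0.0, 1.0]]
y_holdout = [0.7, 0.1]

y_hat = [posterior_mean_prediction(alpha_draws, beta_draws, row) for row in X_holdout]
holdout_mae = mae(y_holdout, y_hat)
print(f"Held-out MAE (log-qty): {holdout_mae:.3f}")
```

Averaging the linear predictor ignores the observation noise σ, which is fine for a point forecast because the noise has mean zero; predictive intervals would need the noise added back.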

Visualise Price Elasticity Posterior

The kernel density plot of β₀’s posterior shows both central tendency and uncertainty, guiding confidence in pricing decisions.

import arviz as az
import seaborn as sns
import matplotlib.pyplot as plt

# Posterior draws of β₀ (price elasticity), with mean and 94% HDI
elasticity_samples = trace.posterior['β'].values[..., 0].flatten()
elasticity_mean = elasticity_samples.mean()
hdi_low, hdi_high = az.hdi(elasticity_samples, hdi_prob=0.94)

# Plot posterior distribution of β₀
sns.kdeplot(elasticity_samples, fill=True)
plt.axvline(elasticity_mean, color='k', linestyle='--', label='Mean')
plt.axvline(hdi_low, color='red', linestyle=':', label='94% HDI')
plt.axvline(hdi_high, color='red', linestyle=':')
plt.title("Posterior of Price Elasticity (β₀)")
plt.xlabel("Elasticity")
plt.legend()
plt.tight_layout()
plt.show()
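The red lines on the plot mark a highest-density interval: the narrowest range that contains the requested share of posterior draws. The following sketch shows that idea on a toy list of numbers. It is a simplified illustration for unimodal samples, not ArviZ's actual implementation, and the function name `hdi` here is a local helper.

```python
import math

def hdi(samples, prob=0.94):
    """Narrowest interval containing roughly `prob` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    span = math.floor(prob * n)        # index offset that covers `prob` of the draws
    n_windows = n - span
    if n_windows < 1:
        raise ValueError("prob is too large for this number of samples")
    # Width of every candidate window; the HDI is the narrowest one
    widths = [ordered[i + span] - ordered[i] for i in range(n_windows)]
    start = widths.index(min(widths))
    return ordered[start], ordered[start + span]

# Toy draws with one outlier: the interval should exclude the value 100
toy_draws = [0, 1, 2, 3, 4, 5, 6, 7, 8, 100]
low, high = hdi(toy_draws, prob=0.8)
print(f"80% HDI: [{low}, {high}]")
```

Unlike an equal-tailed interval, the HDI is free to shift toward the dense part of the distribution, which matters when the posterior is skewed.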

Summary

Using Bayesian Regression to model retail price impact provides:

1. Direct elasticity estimate (β₀) with a credible interval, quantifying how sensitive demand is to price changes.

2. Uncertainty quantification in both elasticity and demand forecasts—crucial for risk‑aware markdown strategies.

3. Actionable insights: Merchandisers can set prices knowing both expected sales lift/drop and the confidence bounds, optimising revenue under uncertainty.
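To make the last point concrete, here is a small sketch of how elasticity draws translate into an expected change in units for a planned price move. Under a log-log demand model, a fractional price change c multiplies quantity by (1 + c) raised to the elasticity. The draws below are invented for illustration and are not results from the model above; `quantity_change` is a hypothetical helper. Because the article fits log(quantity + 1), the conversion is approximate for small quantities.

```python
def quantity_change(elasticity, price_change):
    """Fractional change in units sold for a fractional price change (log-log demand)."""
    return (1.0 + price_change) ** elasticity - 1.0

# Invented posterior draws of the elasticity, for illustration only
hypothetical_draws = [-1.8, -1.6, -1.5, -1.4, -1.2]

# Effect of a 10% price increase under each draw, sorted from largest drop to smallest
changes = sorted(quantity_change(b, 0.10) for b in hypothetical_draws)

print(f"Largest drop in units:  {changes[0]:.1%}")
print(f"Median drop in units:   {changes[len(changes) // 2]:.1%}")
print(f"Smallest drop in units: {changes[-1]:.1%}")
```

Running the same calculation over the real posterior draws would give a full distribution of sales impact for each candidate price, which is what allows a markdown to be chosen with explicit risk bounds.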
