Factory Output Cost Prediction using Bayesian Regression in ML


Manufacturing directors need to predict production cost for a factory from planned output volumes, machine run-times, raw-material usage, and labour hours before finalising production schedules. Costs often vary nonlinearly: per-unit cost decreases with higher volumes (economies of scale) but can rise again when overtime or excess material waste sets in. Uncertainty in parameters (e.g., material yield rates) further complicates planning. Bayesian Regression gives not only a point estimate of cost but also credible intervals that reflect parameter uncertainty, which supports more robust budgeting and risk-aware decision-making. The walkthrough below uses the simplest version of this setup: a single predictor, units produced, with total cost as the target.
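To make the economies-of-scale point concrete, here is a minimal sketch with made-up numbers: a hypothetical fixed cost spread over the output, plus a constant variable cost per unit. The figures and the function name are illustrative assumptions, not values from the dataset.

```python
# Illustrative only: FIXED_COST and VARIABLE_COST are made-up numbers,
# not taken from the tutorial's dataset.
FIXED_COST = 50_000.0     # hypothetical cost incurred regardless of volume
VARIABLE_COST = 12.0      # hypothetical cost of each additional unit


def per_unit_cost(units: int) -> float:
    """Average cost per unit when a fixed cost is spread over `units`."""
    if units <= 0:
        raise ValueError("units must be positive")
    return FIXED_COST / units + VARIABLE_COST


for units in (1_000, 5_000, 20_000):
    print(f"{units:>6} units -> {per_unit_cost(units):.2f} per unit")
```

In this toy setting total cost is still a straight line in volume even though per-unit cost is not, which is one reason a linear model on total cost can be a reasonable starting point.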

Libraries Required

import pandas as pd                               # data handling  
import numpy as np                                # numerical operations  

import matplotlib.pyplot as plt                   # plotting  
import seaborn as sns                             # visualization  

import pymc3 as pm                                # Bayesian modeling  
import arviz as az                                # posterior analysis  

from sklearn.model_selection import train_test_split  
from sklearn.preprocessing import StandardScaler  
from sklearn.metrics import mean_absolute_error

Dataset

Manufacturing Cost

Step-by-Step Code Implementation

Import Libraries & Load Data

import pandas as pd

# Load dataset: columns 'UnitsProduced' and 'TotalCost'
df = pd.read_csv("data/manufacturing_cost.csv")
df.head()

Preprocessing & Train/Test Split

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Features and target
X = df[["UnitsProduced"]].values
y = df["TotalCost"].values

# Train/test split (80/20)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Standardize feature for stable MCMC
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s  = scaler.transform(X_test)
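The standardisation step can be written out by hand to show what the scaler stores and how a line fitted on the standardised feature maps back to original units. This is a sketch in plain Python with made-up volumes; the values of alpha and beta are placeholders rather than fitted results.

```python
from statistics import fmean, pstdev

# Hypothetical training volumes (units produced); not from the dataset.
train_units = [1200.0, 1800.0, 2500.0, 3100.0, 4000.0]

# StandardScaler stores the training mean and the population standard
# deviation (ddof=0) and reuses them for any later data.
mean_units = fmean(train_units)
std_units = pstdev(train_units)


def standardise(units: float) -> float:
    """Apply the training-set mean and scale to a raw volume."""
    return (units - mean_units) / std_units


# Placeholder coefficients on the standardised scale (assumed values).
alpha, beta = 30_000.0, 9_000.0

# cost = alpha + beta * (units - mean) / std, rearranged as a line in raw units.
slope_raw = beta / std_units
intercept_raw = alpha - beta * mean_units / std_units

new_units = 2800.0
cost_standardised = alpha + beta * standardise(new_units)
cost_raw = intercept_raw + slope_raw * new_units
print(round(cost_standardised, 2), round(cost_raw, 2))
```

Both routes give the same prediction, so slope_raw can be read as the expected change in cost per extra unit produced.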

Define & Fit Bayesian Regression Model

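Likelihood: Observed costs are modelled as Normal around μ = α + β·X, where X is the standardised volume, so the model is linear in the standardised feature.
MCMC Sampling: We draw 2000 posterior samples per chain (after 1000 tuning steps) with target_accept=0.9 for stable inference.
Priors (α, β, σ): We use normal priors for intercept and slope, and a half-normal for noise scale. With sigma=10 these are weakly informative only when TotalCost is on a scale of roughly tens. The target is not standardised in this walkthrough, so if costs are recorded in thousands, rescale the target or widen the priors.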

import pymc3 as pm

with pm.Model() as model:
    # Priors for intercept and slope
    α = pm.Normal("α", mu=0, sigma=10)
    β = pm.Normal("β", mu=0, sigma=10)
    σ = pm.HalfNormal("σ", sigma=10)
    
    # Expected cost
    μ = α + β * X_train_s.flatten()
    
    # Likelihood
    Y_obs = pm.Normal("Y_obs", mu=μ, sigma=σ, observed=y_train)
    
    # Sample posterior
    trace = pm.sample(
        draws=2000, tune=1000,
        target_accept=0.9,
        return_inferencedata=True
    )
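pm.sample uses the NUTS sampler by default for continuous models. To illustrate what drawing from the posterior means for this model, the sketch below swaps in a much simpler random-walk Metropolis sampler in plain Python. The synthetic data, the step size and the function names are assumptions for illustration, and this is not how PyMC3 samples internally.

```python
import math
import random

random.seed(0)

# Synthetic stand-in data on a standardised feature (assumed true values).
TRUE_ALPHA, TRUE_BETA, TRUE_SIGMA = 5.0, 2.0, 1.0
xs = [random.gauss(0.0, 1.0) for _ in range(60)]
ys = [TRUE_ALPHA + TRUE_BETA * x + random.gauss(0.0, TRUE_SIGMA) for x in xs]


def log_posterior(alpha: float, beta: float, sigma: float) -> float:
    """Unnormalised log posterior for the priors and likelihood above."""
    if sigma <= 0:
        return -math.inf                      # HalfNormal has no mass at or below 0
    log_prior = (
        -0.5 * (alpha / 10.0) ** 2            # alpha ~ Normal(0, 10)
        - 0.5 * (beta / 10.0) ** 2            # beta  ~ Normal(0, 10)
        - 0.5 * (sigma / 10.0) ** 2           # sigma ~ HalfNormal(10)
    )
    log_lik = sum(
        -math.log(sigma) - 0.5 * ((y - alpha - beta * x) / sigma) ** 2
        for x, y in zip(xs, ys)
    )
    return log_prior + log_lik


def metropolis(n_tune: int, n_draws: int, step: float = 0.15):
    """Random-walk Metropolis; the first n_tune iterations are discarded.

    Unlike PyMC3's tuning phase, nothing is adapted here: it is plain burn-in.
    """
    current = (0.0, 0.0, 1.0)
    current_lp = log_posterior(*current)
    draws = []
    for i in range(n_tune + n_draws):
        proposal = tuple(v + random.gauss(0.0, step) for v in current)
        proposal_lp = log_posterior(*proposal)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if random.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        if i >= n_tune:
            draws.append(current)
    return draws


draws = metropolis(n_tune=1000, n_draws=2000)
alpha_mean = sum(d[0] for d in draws) / len(draws)
beta_mean = sum(d[1] for d in draws) / len(draws)
print(f"alpha ~ {alpha_mean:.2f}, beta ~ {beta_mean:.2f}")
```

The retained draws play the same role as the values stored in trace.posterior. NUTS usually reaches comparable accuracy with far less autocorrelation between draws.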

Posterior Analysis & Prediction

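Posterior Predictive: Sampling from the posterior predictive distribution yields simulated costs that carry both parameter uncertainty and observation noise. As written, Y_obs is simulated for the training rows.
Prediction & MAE: We use posterior means of α and β for point predictions on the standardised test volumes and compute mean absolute error on the hold-out set.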

import arviz as az

# Trace summary
az.summary(trace, round_to=2)

# Posterior predictive sampling
with model:
    ppc = pm.sample_posterior_predictive(trace, var_names=["α","β","σ","Y_obs"])

# Compute mean prediction on test set
α_post = ppc["α"].mean()
β_post = ppc["β"].mean()
# Predict on standardized test volumes
y_pred = α_post + β_post * X_test_s.flatten()

# Compute MAE
from sklearn.metrics import mean_absolute_error
mae = mean_absolute_error(y_test, y_pred)
print(f"Test MAE: ${mae:.2f}")
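The posterior predictive draws above are generated for the training observations. One way to get predictive draws for a new standardised volume is to push each posterior draw of (α, β, σ) through the likelihood by hand. The sketch below uses simulated stand-in draws instead of a fitted trace, so the numbers, the helper name and the choice of an equal-tailed 94% interval are all assumptions for illustration.

```python
import random

random.seed(1)

# Stand-in posterior draws (assumed values, not a fitted trace).
n_draws = 4000
alpha_draws = [random.gauss(30_000.0, 150.0) for _ in range(n_draws)]
beta_draws = [random.gauss(9_000.0, 160.0) for _ in range(n_draws)]
sigma_draws = [abs(random.gauss(1_200.0, 80.0)) for _ in range(n_draws)]


def central_interval(samples, prob=0.94):
    """Equal-tailed interval from sorted samples (simpler than an HDI)."""
    ordered = sorted(samples)
    tail = (1.0 - prob) / 2.0
    lo = ordered[int(tail * len(ordered))]
    hi = ordered[int((1.0 - tail) * len(ordered)) - 1]
    return lo, hi


x_new = 0.5  # a standardised volume: half a standard deviation above the mean

# Uncertainty about the regression line only.
mean_line = [a + b * x_new for a, b in zip(alpha_draws, beta_draws)]
# Predictive draws add observation noise using each draw's own sigma.
predictive = [m + random.gauss(0.0, s) for m, s in zip(mean_line, sigma_draws)]

line_lo, line_hi = central_interval(mean_line)
pred_lo, pred_hi = central_interval(predictive)
print(f"mean line : {line_lo:,.0f} to {line_hi:,.0f}")
print(f"predictive: {pred_lo:,.0f} to {pred_hi:,.0f}")
```

The predictive interval is the wider of the two because it also accounts for σ. It is the more appropriate range when budgeting for a single production run rather than for the average cost at that volume.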

Visualise Predictions with Credible Intervals

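We plot the observed test data, the posterior mean line, and the 94% highest-density interval (HDI) of that line. The band reflects uncertainty in α and β only. It does not include the noise term σ, so individual test points can fall outside it more often than 6% of the time.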

# Generate grid of standardized volumes
X_grid_s = np.linspace(X_test_s.min(), X_test_s.max(), 100)
# Draw posterior samples for predictions
pred_samples = (
    ppc["α"][:, None]
    + ppc["β"][:, None] * X_grid_s[None, :]
)

# Compute mean and 94% credible interval
pred_mean = pred_samples.mean(axis=0)
hpd_bounds = az.hdi(pred_samples, hdi_prob=0.94)

import matplotlib.pyplot as plt

# Transform back to original scale
X_grid = scaler.inverse_transform(X_grid_s.reshape(-1,1)).flatten()

plt.figure(figsize=(8,5))
plt.scatter(X_test, y_test, color="k", alpha=0.5, label="Test data")
plt.plot(X_grid, pred_mean, color="blue", label="Posterior mean")
plt.fill_between(
    X_grid,
    hpd_bounds[:,0],
    hpd_bounds[:,1],
    color="blue", alpha=0.3,
    label="94% Credible interval"
)
plt.xlabel("Units Produced")
plt.ylabel("Total Cost (USD)")
plt.title("Factory Output Cost Prediction with Bayesian Regression")
plt.legend()
plt.show()
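The 94% highest-density interval is the narrowest interval that contains 94% of the samples. A minimal plain-Python sketch of that idea for a single set of draws, assuming a unimodal distribution, is shown below. The function name and the skewed stand-in data are illustrative, and ArviZ's own implementation may differ in detail.

```python
import math
import random


def hdi_1d(samples, prob=0.94):
    """Narrowest interval that contains at least `prob` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    n_inside = math.ceil(prob * n)            # points each candidate window holds
    if n_inside < 2 or n_inside > n:
        raise ValueError("need more samples for this probability")
    # Slide a window of n_inside consecutive sorted points and keep the narrowest.
    widths = [
        ordered[i + n_inside - 1] - ordered[i]
        for i in range(n - n_inside + 1)
    ]
    start = widths.index(min(widths))
    return ordered[start], ordered[start + n_inside - 1]


random.seed(2)
# Right-skewed stand-in draws, where HDI and equal-tailed intervals differ.
draws = [random.expovariate(1.0) for _ in range(5000)]

low, high = hdi_1d(draws)

ordered = sorted(draws)
eq_low = ordered[int(0.03 * len(ordered))]
eq_high = ordered[int(0.97 * len(ordered)) - 1]

print(f"HDI         : {low:.3f} to {high:.3f}")
print(f"equal-tailed: {eq_low:.3f} to {eq_high:.3f}")
```

For symmetric posteriors, such as the regression line here, the two intervals are nearly identical. They diverge for skewed quantities, σ being a typical case.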

Summary

This Bayesian Regression approach:

1. Captures parameter uncertainty, providing credible intervals around cost forecasts, not just point estimates.

2. Handles limited data gracefully, thanks to priors that regularise slope and intercept.

3. Delivers actionable insights: managers can see not only the expected cost given the planned output but also the range of plausible costs, informing risk‑aware budgeting and contingency planning.
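Point 2 can be illustrated with a one-parameter case where the algebra is closed-form. For a centred model y = βx + noise with known noise scale σ and a Normal(0, τ) prior on β, the posterior mean is Σxy / (Σx² + σ²/τ²), which pulls the least-squares slope towards zero. The data and the values of σ and τ below are assumptions chosen for illustration. The tutorial's model also estimates α and σ, so this is a simplification.

```python
# Tiny made-up dataset: standardised volumes and centred costs.
xs = [-1.2, -0.3, 0.4, 1.1]
ys = [-2.0, -0.9, 1.1, 2.3]

SIGMA = 1.0  # assumed known noise scale


def posterior_mean_slope(tau: float) -> float:
    """Posterior mean of beta under a Normal(0, tau) prior and known SIGMA."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + SIGMA**2 / tau**2)


# Least-squares slope through the origin, for comparison.
ols_slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

for tau in (10.0, 1.0, 0.3):
    print(f"tau={tau:>4}: slope={posterior_mean_slope(tau):.3f} (OLS {ols_slope:.3f})")
```

With a wide prior such as τ = 10 the estimate is almost the least-squares value. The regularising effect becomes noticeable only as the prior is tightened or the data become scarcer.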
