Food Prep Cost Prediction using Bayesian Regression in ML

Chefs, kitchen managers, and catering directors need to forecast the ingredient and labour cost of each menu item before finalising recipes and pricing. Key drivers such as ingredient unit cost, portion size, prep time, and recipe complexity combine in nonlinear ways: buying ingredients in bulk reduces per-unit cost, and longer cook times may trigger higher-skill (and higher-wage) labour bands. Ingredient prices also fluctuate, which introduces uncertainty. The code below uses only each dish's category, sub-category, and menu price as predictors of its cost. By applying Bayesian Regression, we predict a best-estimate prep cost per dish and also derive credible intervals that capture our uncertainty, enabling data-driven margin setting and dynamic menu adjustments.

Libraries Required

import pandas as pd               # data loading & manipulation  
import numpy as np                # numerical operations  

import matplotlib.pyplot as plt   # plotting  
import seaborn as sns             # visualization  

import pymc3 as pm                # Bayesian modeling  
import arviz as az                # posterior analysis  

from sklearn.model_selection import train_test_split  
from sklearn.preprocessing import StandardScaler  
from sklearn.metrics import mean_absolute_error  

Dataset

Restaurant Cost and Sales Dataset

Step-by-Step Code Implementation

Data Loading & Preprocessing

  • We one‑hot encode Category and Sub Category, so recipe type influences cost.
  • We z‑score Price for stable MCMC convergence.

import pandas as pd

# Load dataset, keep only relevant fields, drop missing
df = pd.read_csv("data/Cost_Sales.csv")
df = df[['Category','Sub Category','Price','Cost']].dropna()

# One-hot encode dish categories
df = pd.get_dummies(df, columns=['Category','Sub Category'], drop_first=True)

# Define features (Price + dummies) and target (actual prep cost).
# Cast to float: recent pandas returns boolean dummies, and an integer or
# object array would break the in-place scaling of column 0 below.
X = df.drop(columns='Cost').values.astype(float)
y = df['Cost'].values.astype(float)

# Train/test split (80/20)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Standardize the numeric Price column (column 0) for stable MCMC
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler().fit(X_train[:, :1])
X_train_s = X_train.copy()
X_train_s[:, :1] = scaler.transform(X_train[:, :1])
X_test_s  = X_test.copy()
X_test_s[:, :1]  = scaler.transform(X_test[:, :1])
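
To make the scaling step concrete, here is a minimal NumPy sketch of what the z-score and its back-transform compute. The price values are made up for illustration and are not taken from the dataset. The point to notice is that the test prices are scaled with the training mean and standard deviation, which mirrors fitting the scaler on the training split only.

```python
import numpy as np

# Hypothetical menu prices (USD) standing in for the Price column
train_price = np.array([8.0, 12.0, 15.0, 20.0, 25.0])
test_price = np.array([10.0, 30.0])

# Statistics come from the training split only; StandardScaler uses the
# population standard deviation (ddof=0)
mu = train_price.mean()
sd = train_price.std(ddof=0)

train_z = (train_price - mu) / sd
test_z = (test_price - mu) / sd      # reuse the training mean and std

# Back-transform to the original scale, as inverse_transform does
test_back = test_z * sd + mu

print(train_z.round(2), test_z.round(2), test_back)
```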

Define & Fit Bayesian Regression Model

Model Specification:

  • α (intercept) and β (coefficients) get weakly informative Normal priors.
  • σ governs residual spread (HalfNormal prior).
  • The likelihood ties observed prep cost to a Normal distribution centred on the linear predictor.
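
Written out, the specification above corresponds to the following model. The prior scales are read off the code in this section (where `sigma` is a standard deviation), and p denotes the number of feature columns after one-hot encoding:

```latex
\begin{aligned}
\alpha   &\sim \mathcal{N}(0,\ 10^2) \\
\beta_j  &\sim \mathcal{N}(0,\ 5^2), \qquad j = 1, \dots, p \\
\sigma   &\sim \mathrm{HalfNormal}(5) \\
\mu_i    &= \alpha + \mathbf{x}_i^{\top} \boldsymbol{\beta} \\
y_i      &\sim \mathcal{N}(\mu_i,\ \sigma^2)
\end{aligned}
```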

Sampling: We draw 2,000 posterior samples per chain (after 1,000 tuning steps) with target_accept=0.9, which makes the NUTS sampler take smaller steps and reduces the risk of divergent transitions.

import pymc3 as pm

with pm.Model() as food_cost_model:
    # Priors
    α = pm.Normal("α", mu=0, sigma=10)                     # intercept prior
    β = pm.Normal("β", mu=0, sigma=5, shape=X_train_s.shape[1])  
    σ = pm.HalfNormal("σ", sigma=5)                         # noise scale

    # Expected cost linear predictor
    μ = α + pm.math.dot(X_train_s, β)

    # Likelihood: observed prep cost
    Y_obs = pm.Normal("Y_obs", mu=μ, sigma=σ, observed=y_train)

    # Sample posterior
    trace = pm.sample(
        draws=2000,       # number of posterior samples
        tune=1000,        # burn‑in
        target_accept=0.9,
        return_inferencedata=True
    )
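
Before trusting the posterior, it is worth checking that the chains agree with each other. az.summary reports an r_hat column for this purpose. As a rough illustration of the idea, the sketch below computes the classic Gelman-Rubin statistic in plain NumPy on synthetic chains. The function name and the data are illustrative assumptions, and ArviZ itself uses a more refined variant, so the numbers will not match its output exactly.

```python
import numpy as np

def gelman_rubin_rhat(chains):
    """Classic potential scale reduction factor for draws shaped (n_chains, n_draws)."""
    chains = np.asarray(chains, dtype=float)
    n = chains.shape[1]
    chain_means = chains.mean(axis=1)
    W = chains.var(axis=1, ddof=1).mean()     # average within-chain variance
    B = n * chain_means.var(ddof=1)           # between-chain variance
    var_hat = (n - 1) / n * W + B / n         # pooled variance estimate
    return float(np.sqrt(var_hat / W))

rng = np.random.default_rng(0)
mixed = rng.normal(0.0, 1.0, size=(4, 2000))                # four chains, same target
stuck = mixed + np.array([[0.0], [0.0], [0.0], [3.0]])      # one chain shifted by 3

rhat_mixed = gelman_rubin_rhat(mixed)
rhat_stuck = gelman_rubin_rhat(stuck)
print(f"well-mixed: {rhat_mixed:.3f}, one chain off: {rhat_stuck:.3f}")
```

Values close to 1 suggest the chains are sampling the same distribution; a value well above 1, as in the shifted case, is a sign that sampling should be rerun with more tuning or a reparameterised model.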

Posterior Analysis & Point Predictions

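Posterior Predictive: We draw replicated costs from the posterior predictive distribution of the fitted model. Called without new inputs, pm.sample_posterior_predictive simulates the training observations, which is a check on how well the model reproduces the data.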

Evaluation: We compute the Mean Absolute Error (MAE) on held‑out test data to quantify point‑forecast accuracy.

import arviz as az
from sklearn.metrics import mean_absolute_error

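# Summarize parameter posteriors (print so the table also shows outside a notebook)
print(az.summary(trace, round_to=2))

# Posterior predictive sampling for the training observations (model check)
with food_cost_model:
    ppc = pm.sample_posterior_predictive(trace, var_names=["Y_obs"])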

# Extract posterior means
α_post = trace.posterior["α"].mean().item()
β_post = trace.posterior["β"].mean(dim=["chain","draw"]).values

# Compute point predictions
y_pred = α_post + X_test_s.dot(β_post)

# Evaluate accuracy
mae = mean_absolute_error(y_test, y_pred)
print(f"Test MAE: ${mae:.2f}")
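
Plugging posterior means into the linear predictor, as done above, gives the same point forecast as predicting with every posterior draw and then averaging, because the predictor is linear in α and β. The sketch below checks this on synthetic stand-ins for the posterior draws. All arrays and names here are illustrative assumptions, not output of the fitted model.

```python
import numpy as np

rng = np.random.default_rng(1)
n_draws, n_features, n_rows = 500, 3, 4

# Synthetic stand-ins for flattened posterior draws and a standardized test matrix
alpha_draws = rng.normal(2.0, 0.1, size=n_draws)
beta_draws = rng.normal(0.5, 0.2, size=(n_draws, n_features))
X_new = rng.normal(size=(n_rows, n_features))

# Route 1: plug posterior means into the linear predictor
pred_from_means = alpha_draws.mean() + X_new @ beta_draws.mean(axis=0)

# Route 2: predict with every draw, then average over draws
per_draw = alpha_draws[:, None] + beta_draws @ X_new.T    # (n_draws, n_rows)
pred_from_draws = per_draw.mean(axis=0)

# The spread across draws measures uncertainty in the expected cost per row
pred_sd = per_draw.std(axis=0)

print(np.allclose(pred_from_means, pred_from_draws), pred_sd.round(3))
```

The second route costs more memory but also yields a per-dish uncertainty, which the point forecast alone does not.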

Visualise Predictions & Credible Intervals

By sweeping Price and holding dish attributes at typical values, we plot both the posterior mean prep cost curve and its 94% Highest Posterior Density interval—revealing expected margins and pricing uncertainty.

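import numpy as np
import matplotlib.pyplot as plt
import arviz as az

# Generate a grid of (standardized) Price values
price_grid = np.linspace(X_train_s[:,0].min(), X_train_s[:,0].max(), 100)
grid = np.median(X_train_s, axis=0)[None,:].repeat(100, axis=0)
grid[:,0] = price_grid

# Posterior predictive on grid. The model was fit on X_train_s, so re-sampling
# Y_obs would return training-set predictions; instead, push the grid through
# the posterior draws directly.
post = trace.posterior
α_s = post["α"].values.reshape(-1)                   # (n_samples,)
β_s = post["β"].values.reshape(-1, grid.shape[1])    # (n_samples, n_features)
σ_s = post["σ"].values.reshape(-1)                   # (n_samples,)

rng    = np.random.default_rng(42)
μ_grid = α_s[:, None] + β_s @ grid.T                 # expected cost, (n_samples, 100)
preds  = rng.normal(μ_grid, σ_s[:, None])            # add observation noise

mean_pred = preds.mean(axis=0)
# Leading axis marks a single "chain" so ArviZ reads the array as (chain, draw, point)
hpd       = az.hdi(preds[None, :, :], hdi_prob=0.94)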

# Back‑transform Price to original scale
price_orig = scaler.inverse_transform(price_grid.reshape(-1,1)).flatten()

plt.figure(figsize=(8,5))
plt.plot(price_orig, mean_pred, label="Posterior mean")
plt.fill_between(price_orig, hpd[:,0], hpd[:,1], alpha=0.3,
                 label="94% credible interval")
plt.scatter(
    scaler.inverse_transform(X_test_s[:, :1]).flatten(),
    y_test, color="k", alpha=0.5, label="Test data"
)
plt.xlabel("Menu Price (USD)")
plt.ylabel("Preparation Cost (USD)")
plt.title("Bayesian Regression: Prep Cost vs. Price")
plt.legend()
plt.tight_layout()
plt.show()
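
The shaded band in the plot is a highest density interval: the narrowest interval that contains 94% of the draws. The sketch below shows one simple way to compute it from samples for a unimodal distribution. The function name is a hypothetical helper for illustration and the data are synthetic; az.hdi remains the call used in this tutorial.

```python
import numpy as np

def hdi_from_samples(samples, prob=0.94):
    """Narrowest interval containing roughly `prob` of the samples (unimodal case)."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    k = int(np.floor(prob * n))        # index offset spanned by each candidate window
    widths = x[k:] - x[: n - k]        # width of every window x[i] .. x[i + k]
    i = int(np.argmin(widths))         # the narrowest window wins
    return x[i], x[i + k]

# Skewed synthetic draws, where the HDI and the equal-tailed interval differ
rng = np.random.default_rng(7)
draws = rng.exponential(scale=1.0, size=20000)

lo, hi = hdi_from_samples(draws, prob=0.94)
q_lo, q_hi = np.quantile(draws, [0.03, 0.97])
coverage = np.mean((draws >= lo) & (draws <= hi))

print(f"HDI width: {hi - lo:.2f}, equal-tailed width: {q_hi - q_lo:.2f}")
```

For a symmetric posterior the two intervals nearly coincide; for a skewed one the HDI is shorter because it shifts toward the region where the draws are densest.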

Summary

This Bayesian Regression workflow for Food Preparation Cost Prediction offers:

1. Point estimates of per-dish prep cost from menu price and dish category, scored by MAE on held-out data.

2. Credible intervals quantifying how uncertain each cost forecast is, which matters when ingredient and labour prices are volatile.

3. Actionable insights: kitchen managers can set menu prices and manage procurement with full awareness of cost risk, optimising margins under uncertainty.

ProjectGurukul Team
