Food Prep Cost Prediction using Bayesian Regression in ML


Chefs, kitchen managers, and catering directors need to forecast the combined ingredient and labour cost of each menu item before finalising recipes and pricing. Key drivers such as ingredient unit cost, portion size, prep time, and recipe complexity interact in nonlinear ways: buying ingredients in bulk reduces per-unit cost, while longer cook times may trigger higher-skill (and higher-wage) labour bands. Ingredient prices also fluctuate, which introduces uncertainty. Bayesian regression gives us a best-estimate prep cost per dish together with credible intervals that capture this uncertainty, supporting data-driven margin setting and dynamic menu adjustments. The model in this walkthrough is deliberately simple: a linear regression of cost on menu price and dish category.

Libraries Required

import pandas as pd               # data loading & manipulation  
import numpy as np                # numerical operations  

import matplotlib.pyplot as plt   # plotting  
import seaborn as sns             # visualization  

import pymc3 as pm                # Bayesian modeling  
import arviz as az                # posterior analysis  

from sklearn.model_selection import train_test_split  
from sklearn.preprocessing import StandardScaler  
from sklearn.metrics import mean_absolute_error

Dataset

Restaurant Cost and Sales Dataset

Step-by-Step Code Implementation

Data Loading & Preprocessing

We one‑hot encode Category and Sub Category, so recipe type influences cost.
We z‑score Price for stable MCMC convergence.

import pandas as pd

# Load dataset, keep only relevant fields, drop missing
df = pd.read_csv("data/Cost_Sales.csv")
df = df[['Category','Sub Category','Price','Cost']].dropna()

# One‑hot encode dish categories
df = pd.get_dummies(df, columns=['Category','Sub Category'], drop_first=True)

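# Define features (Price + dummies) and target (actual prep cost).
# Cast to float: recent pandas versions emit boolean dummies, which would
# otherwise turn the mixed feature matrix into an object array.
X = df.drop(columns='Cost').values.astype(float)
y = df['Cost'].values.astype(float)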

# Train/test split (80/20)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Standardize the numeric Price column (column 0) for stable MCMC
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler().fit(X_train[:, :1])
X_train_s = X_train.copy()
X_train_s[:, :1] = scaler.transform(X_train[:, :1])
X_test_s  = X_test.copy()
X_test_s[:, :1]  = scaler.transform(X_test[:, :1])
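
To make the encoding step concrete, here is a minimal sketch on a three-row toy frame. The categories and prices are made up for illustration and are not taken from the dataset. It shows which dummy column drop_first=True removes and why casting to float matters before the matrix is handed to a sampler.

```python
import pandas as pd

# Hypothetical toy rows, for illustration only
toy = pd.DataFrame({
    "Category": ["Main", "Dessert", "Main"],
    "Price": [12.0, 6.5, 15.0],
})

# drop_first=True drops the alphabetically first level ("Dessert"),
# which becomes the baseline absorbed by the intercept
encoded = pd.get_dummies(toy, columns=["Category"], drop_first=True)

# Recent pandas versions emit boolean dummies; cast so the whole
# design matrix is a single float array
X_toy = encoded.to_numpy(dtype=float)

print(list(encoded.columns))
print(X_toy)
```

Because "Dessert" is the dropped level, the coefficient on Category_Main reads as the cost difference between a main and a dessert at the same price.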

Define & Fit Bayesian Regression Model

Model Specification:

α (intercept) and β (coefficients) get weakly informative Normal priors.
σ governs residual spread (HalfNormal prior).
The likelihood ties observed prep cost to a Normal distribution centred on the linear predictor.
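
Written out, the specification above corresponds to the following model, where x_i is the feature row for dish i (standardised Price plus the category dummies), p is the number of features, and the prior scales are the ones used in the code below:

```latex
\begin{aligned}
\alpha  &\sim \mathcal{N}(0,\ 10^2) \\
\beta_j &\sim \mathcal{N}(0,\ 5^2), \quad j = 1, \dots, p \\
\sigma  &\sim \text{HalfNormal}(5) \\
\mu_i   &= \alpha + x_i^\top \beta \\
y_i     &\sim \mathcal{N}(\mu_i,\ \sigma^2)
\end{aligned}
```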

Sampling: We draw 2,000 posterior samples per chain (after 1,000 tuning steps). Setting target_accept=0.9 makes the NUTS sampler take smaller steps, which reduces the risk of divergences.

import pymc3 as pm

with pm.Model() as food_cost_model:
    # Priors
    α = pm.Normal("α", mu=0, sigma=10)                     # intercept prior
    β = pm.Normal("β", mu=0, sigma=5, shape=X_train_s.shape[1])  
    σ = pm.HalfNormal("σ", sigma=5)                         # noise scale

    # Expected cost linear predictor
    μ = α + pm.math.dot(X_train_s, β)

    # Likelihood: observed prep cost
    Y_obs = pm.Normal("Y_obs", mu=μ, sigma=σ, observed=y_train)

    # Sample posterior
    trace = pm.sample(
        draws=2000,       # posterior samples kept per chain
        tune=1000,        # tuning (warm-up) steps, discarded
        target_accept=0.9,
        return_inferencedata=True
    )
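
Before trusting priors described as weakly informative, it can help to look at what they imply on the cost scale. The sketch below draws from the same priors with NumPy alone and simulates costs for a few random feature rows. The feature rows, seed, and sizes are assumptions made for illustration; with real data you would use the actual design matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n_draws, n_rows, n_features = 2000, 5, 3   # illustrative sizes

# Stand-in features: one standardised numeric column plus 0/1 dummies
X_demo = np.column_stack([
    rng.normal(size=n_rows),
    rng.integers(0, 2, size=(n_rows, n_features - 1)),
]).astype(float)

# Draw from the priors used in the model
alpha = rng.normal(0, 10, size=n_draws)
beta = rng.normal(0, 5, size=(n_draws, n_features))
sigma = np.abs(rng.normal(0, 5, size=n_draws))   # HalfNormal(5) via |Normal(0, 5)|

# Costs implied by the priors alone: shape (n_draws, n_rows)
mu = alpha[:, None] + beta @ X_demo.T
y_sim = rng.normal(mu, sigma[:, None])

lo, hi = np.percentile(y_sim, [3, 97])
print(f"94% of prior-simulated costs fall between {lo:.1f} and {hi:.1f}")
```

Because these priors are centred on zero, the simulated range includes negative costs. If that range looks implausible for your menu, one option is to recentre the intercept prior near the average cost or to standardise the target.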

Posterior Analysis & Point Predictions

Posterior Predictive: We draw replicated costs from the fitted model for the training rows. Comparing these draws with the observed costs is a basic check that the model reproduces the data it was fitted on.

Evaluation: We compute the Mean Absolute Error (MAE) on held‑out test data to quantify point‑forecast accuracy.

import arviz as az
from sklearn.metrics import mean_absolute_error

# Summarize parameter posteriors
print(az.summary(trace, round_to=2))

# Posterior predictive sampling
with food_cost_model:
    ppc = pm.sample_posterior_predictive(trace, var_names=["Y_obs"])

# Extract posterior means
α_post = trace.posterior["α"].mean().item()
β_post = trace.posterior["β"].mean(dim=["chain","draw"]).values

# Compute point predictions
y_pred = α_post + X_test_s.dot(β_post)

# Evaluate accuracy
mae = mean_absolute_error(y_test, y_pred)
print(f"Test MAE: ${mae:.2f}")
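
MAE only scores the point forecast. A complementary check is interval coverage: the share of held-out costs that fall inside their 94% predictive interval. The sketch below uses synthetic posterior draws and test data as stand-ins (every array here is hypothetical); with the fitted model, the draws would come from trace.posterior instead.

```python
import numpy as np

rng = np.random.default_rng(1)
n_draws, n_test, n_features = 1000, 200, 3

# Hypothetical test data generated from a known linear model
X_new = rng.normal(size=(n_test, n_features))
true_alpha, true_beta, true_sigma = 4.0, np.array([2.0, -1.0, 0.5]), 1.0
y_new = true_alpha + X_new @ true_beta + rng.normal(0, true_sigma, n_test)

# Stand-in posterior draws, tightly centred on the true values
alpha_s = rng.normal(true_alpha, 0.05, n_draws)
beta_s = rng.normal(true_beta, 0.05, (n_draws, n_features))
sigma_s = np.abs(rng.normal(true_sigma, 0.05, n_draws))

# Predictive draws: parameter uncertainty plus observation noise
mu = alpha_s[:, None] + beta_s @ X_new.T          # (n_draws, n_test)
y_rep = rng.normal(mu, sigma_s[:, None])

# Equal-tailed 94% interval per test row, then empirical coverage
lower, upper = np.percentile(y_rep, [3, 97], axis=0)
coverage = np.mean((y_new >= lower) & (y_new <= upper))
print(f"Empirical coverage of 94% intervals: {coverage:.2f}")
```

Coverage far below 0.94 suggests the intervals are too narrow (the model is overconfident); coverage near 1.0 suggests they are wider than needed.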

Visualise Predictions & Credible Intervals

By sweeping Price and holding dish attributes at typical values, we plot the posterior predictive mean prep cost and its 94% highest-density interval (HDI), showing how expected cost and its uncertainty vary with menu price.

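import numpy as np
import matplotlib.pyplot as plt

# Generate a grid of (standardised) Price values; hold other features at their medians
price_grid = np.linspace(X_train_s[:,0].min(), X_train_s[:,0].max(), 100)
grid = np.median(X_train_s, axis=0)[None,:].repeat(100, axis=0)
grid[:,0] = price_grid

# Posterior predictive on grid, built directly from the posterior draws.
# (Sampling "Y_obs" from the model would return predictions for the
# training rows it was fitted on, not for the grid.)
rng  = np.random.default_rng(42)
post = trace.posterior.stack(sample=("chain", "draw"))
idx  = rng.choice(post.sizes["sample"], size=1000, replace=False)
α_s  = post["α"].values[idx]                            # (1000,)
β_s  = post["β"].transpose("sample", ...).values[idx]   # (1000, n_features)
σ_s  = post["σ"].values[idx]                            # (1000,)

μ_grid = α_s[:, None] + β_s @ grid.T                    # (1000, 100)
preds  = rng.normal(μ_grid, σ_s[:, None])               # add observation noise

mean_pred = preds.mean(axis=0)
hpd       = az.hdi(preds[None, ...], hdi_prob=0.94)     # explicit (chain, draw, point) layout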

# Back‑transform Price to original scale
price_orig = scaler.inverse_transform(price_grid.reshape(-1,1)).flatten()

plt.figure(figsize=(8,5))
plt.plot(price_orig, mean_pred, label="Posterior mean")
plt.fill_between(price_orig, hpd[:,0], hpd[:,1], alpha=0.3,
                 label="94% credible interval")
plt.scatter(
    scaler.inverse_transform(X_test_s[:, :1]).flatten(),
    y_test, color="k", alpha=0.5, label="Test data"
)
plt.xlabel("Menu Price (USD)")
plt.ylabel("Preparation Cost (USD)")
plt.title("Bayesian Regression: Prep Cost vs. Price")
plt.legend()
plt.tight_layout()
plt.show()
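
For readers unfamiliar with the term, a highest-density interval is the narrowest interval that contains the requested probability mass. The function below is a minimal NumPy sketch of that idea for a unimodal sample. hdi_from_samples is a hypothetical helper written for this illustration and is not part of ArviZ; az.hdi remains the tool used in the code above.

```python
import numpy as np

def hdi_from_samples(samples, prob=0.94):
    """Narrowest interval containing `prob` of the samples (unimodal case)."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = len(x)
    k = int(np.ceil(prob * n))            # points the interval must contain
    widths = x[k - 1:] - x[: n - k + 1]   # width of every window of k points
    start = int(np.argmin(widths))
    return x[start], x[start + k - 1]

rng = np.random.default_rng(2)
skewed = rng.gamma(shape=2.0, scale=1.5, size=20_000)   # right-skewed sample

hdi_lo, hdi_hi = hdi_from_samples(skewed)
eq_lo, eq_hi = np.percentile(skewed, [3, 97])

print(f"HDI:          [{hdi_lo:.2f}, {hdi_hi:.2f}]")
print(f"Equal-tailed: [{eq_lo:.2f}, {eq_hi:.2f}]")
```

For a symmetric distribution the two intervals nearly coincide. For a skewed one, such as costs with an occasional expensive outlier, the HDI shifts toward the region where most of the mass sits.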

Summary

This Bayesian Regression workflow for Food Preparation Cost Prediction offers:

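1. Point estimates of per-dish prep cost from menu price and dish category, scored by MAE on held-out data.

2. Credible intervals that quantify how uncertain each cost forecast is, reflecting both parameter uncertainty and unexplained variation such as ingredient and labour price volatility.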

3. Actionable insights: kitchen managers can set menu prices and manage procurement with full awareness of cost risk, optimising margins under uncertainty.
