Tool Rental Cost Trend Prediction with Polynomial Regression in ML


Analysts at a rental company need to forecast week-over-week changes in the average tool-rental cost (USD/day) so that budgets and dynamic prices can be set before operations are adjusted. Historical rental logs suggest that cost changes depend nonlinearly on the prior week's average cost (momentum and saturation), rental volume (demand pressure), tool category mix (premium vs. standard), and seasonal factors such as holiday spikes. A simple linear model misses curvature, such as price plateaus at high demand, while an unregularised high-degree polynomial fits noise. Fitting a polynomial regression on engineered features with Ridge (ℓ2) regularisation captures smooth cost-trend dynamics while keeping coefficients that can be inspected term by term to guide pricing strategy.
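The trade-off described above can be illustrated with a minimal sketch on synthetic data. Everything here is an assumption made for illustration: the saturating curve, the noise level, the degree of 9, and the helper name poly_model are not taken from the rental dataset. The printed numbers depend on the random seed, so treat them as a demonstration of the pattern rather than a result.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(0)

# Synthetic, illustrative data only: a saturating curve plus noise
x = np.sort(rng.uniform(0, 10, 60)).reshape(-1, 1)
y = 5 * (1 - np.exp(-0.5 * x.ravel())) + rng.normal(0, 0.3, 60)

# Alternate points between a training half and a test half
x_train, y_train = x[::2], y[::2]
x_test, y_test = x[1::2], y[1::2]

def poly_model(degree, estimator):
    """Hypothetical helper: scale, expand to the given degree, then fit."""
    return make_pipeline(
        StandardScaler(),
        PolynomialFeatures(degree, include_bias=False),
        estimator,
    )

models = {
    "linear": poly_model(1, LinearRegression()),
    "degree 9, no penalty": poly_model(9, LinearRegression()),
    "degree 9, ridge": poly_model(9, Ridge(alpha=1.0)),
}

results = {}
for name, model in models.items():
    model.fit(x_train, y_train)
    train_rmse = np.sqrt(mean_squared_error(y_train, model.predict(x_train)))
    test_rmse = np.sqrt(mean_squared_error(y_test, model.predict(x_test)))
    results[name] = (train_rmse, test_rmse)
    print(f"{name:22s} train RMSE={train_rmse:.3f}  test RMSE={test_rmse:.3f}")
```

Comparing train and test RMSE for each model is the quickest way to see whether extra polynomial terms are fitting signal or noise.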

Dataset

Commercial Tool Rental Data

Step-by-Step Code Implementation

1. Libraries Required

import pandas as pd                             # data handling  
import numpy as np                              # numerical ops  

import matplotlib.pyplot as plt                 # plotting  
import seaborn as sns                           # visualization  

from sklearn.model_selection import train_test_split, GridSearchCV  
from sklearn.preprocessing import StandardScaler, PolynomialFeatures  
from sklearn.linear_model import Ridge  
from sklearn.pipeline import Pipeline  
from sklearn.metrics import mean_squared_error, r2_score

2. Load Data & Compute Features

import pandas as pd

# Load weekly rentals (adjust path)
df = pd.read_csv("data/commercial-tool-rental-data-for-2016-and-2017/rentals.csv")

# Compute average cost per day by week
df['rental_date'] = pd.to_datetime(df['rental_date'])
df['week'] = df['rental_date'].dt.to_period('W').dt.start_time   # Monday that starts each week
weekly = df.groupby('week').agg({
    'daily_cost': 'mean',
    'rental_id': 'count',            # volume
    'tool_category': lambda x: x.mode()[0]
}).rename(columns={'daily_cost':'avg_cost','rental_id':'volume'}).reset_index()
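The weekly aggregation can be checked on a small hand-made table before running it on the full file. This is a sketch only: the rows are invented, and the column names (rental_id, rental_date, daily_cost, tool_category) are the ones the tutorial code assumes the CSV contains.

```python
import pandas as pd

# Hypothetical rental log with the column names assumed by the tutorial
toy = pd.DataFrame({
    "rental_id": [1, 2, 3, 4, 5],
    "rental_date": pd.to_datetime(
        ["2016-01-04", "2016-01-05", "2016-01-06", "2016-01-11", "2016-01-12"]
    ),
    "daily_cost": [10.0, 20.0, 30.0, 40.0, 60.0],
    "tool_category": ["Standard", "Standard", "Premium", "Premium", "Premium"],
})

# Map each rental to the start (Monday) of its calendar week
toy["week"] = toy["rental_date"].dt.to_period("W").dt.start_time

# One row per week: mean cost, rental count, most frequent category
toy_weekly = (
    toy.groupby("week")
       .agg(
           avg_cost=("daily_cost", "mean"),
           volume=("rental_id", "count"),
           tool_category=("tool_category", lambda s: s.mode()[0]),
       )
       .reset_index()
)
print(toy_weekly)
```

The first three rentals fall in the week starting 2016-01-04 and the last two in the week starting 2016-01-11, so the result has two rows.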

3. Target Engineering & Lag Features

Lag features (cost_prev, volume_prev) capture momentum and demand saturation.
One-hot encoding of tool_category models category-specific pricing effects.
PolynomialFeatures generates squared and interaction terms, e.g. cost_prev², cost_prev × volume_prev, and volume_prev × tool_category_Premium (the exact dummy column names depend on the categories present in the data), to capture curvature and interaction effects in cost dynamics.

# Sort and lag
weekly = weekly.sort_values('week')
weekly['cost_prev']   = weekly['avg_cost'].shift(1)
weekly['volume_prev'] = weekly['volume'].shift(1)
weekly.dropna(subset=['cost_prev','volume_prev'], inplace=True)

# One‑hot encode category
weekly = pd.get_dummies(weekly, columns=['tool_category'], drop_first=True)

# Compute cost growth target
weekly['cost_growth_pct'] = (weekly['avg_cost'] - weekly['cost_prev']) / weekly['cost_prev'] * 100

# Features & target
feature_cols = ['cost_prev','volume_prev'] + \
               [c for c in weekly.columns if c.startswith('tool_category_')]
X = weekly[feature_cols]
y = weekly['cost_growth_pct']
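A worked example on three invented weeks shows how the lag and growth columns line up. The values are hypothetical and chosen so the arithmetic is easy to follow by hand.

```python
import pandas as pd

# Three hypothetical weeks of aggregated data
toy_weekly = pd.DataFrame({
    "week": pd.to_datetime(["2016-01-04", "2016-01-11", "2016-01-18"]),
    "avg_cost": [20.0, 25.0, 20.0],
    "volume": [3, 5, 4],
})

toy_weekly = toy_weekly.sort_values("week")
toy_weekly["cost_prev"] = toy_weekly["avg_cost"].shift(1)    # last week's cost
toy_weekly["volume_prev"] = toy_weekly["volume"].shift(1)    # last week's volume

# The first week has no predecessor, so it is dropped
toy_weekly = toy_weekly.dropna(subset=["cost_prev", "volume_prev"])

# Percentage change relative to the previous week
toy_weekly["cost_growth_pct"] = (
    (toy_weekly["avg_cost"] - toy_weekly["cost_prev"])
    / toy_weekly["cost_prev"] * 100
)
print(toy_weekly[["week", "cost_prev", "avg_cost", "cost_growth_pct"]])
```

A rise from 20 to 25 is a 25% increase, while the fall from 25 back to 20 is a 20% decrease, because each change is measured against the previous week's cost.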

4. Build Polynomial Regression Pipeline

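StandardScaler standardises the raw inputs before the polynomial expansion, which keeps the expanded terms on comparable scales so that Ridge's ℓ2 penalty does not shrink some terms far more than others.
Ridge's alpha sets the penalty strength and so limits overfitting from high-order terms.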

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, PolynomialFeatures
from sklearn.linear_model import Ridge

pipe = Pipeline([
    ('scale', StandardScaler()),  
    ('poly', PolynomialFeatures(include_bias=False)),  
    ('ridge', Ridge(random_state=42))  
])

5. Train/Test Split & Hyperparameter Search

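GridSearchCV tunes the polynomial degree (1–3) and regularisation strength α (10⁻³ to 10³) with 5-fold cross-validation on the training set, selecting the combination with the lowest validation RMSE. An integer cv uses unshuffled K-fold splits, so some folds validate on weeks that precede the weeks used for training.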

from sklearn.model_selection import train_test_split, GridSearchCV
import numpy as np

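# Chronological split: the last 20% of weeks form the test set
# (random_state has no effect when shuffle=False, so it is omitted)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)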

param_grid = {
    'poly__degree': [1, 2, 3],
    'ridge__alpha': np.logspace(-3, 3, 7)
}

gs = GridSearchCV(
    pipe, param_grid,
    cv=5,
    scoring='neg_root_mean_squared_error',
    n_jobs=-1, verbose=1
)
gs.fit(X_train, y_train)
print("Best params:", gs.best_params_)
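Because the rows are ordered by week, a forward-chaining splitter may be a better fit than plain K-fold. The sketch below swaps in scikit-learn's TimeSeriesSplit, which is a substitution and not what the tutorial code uses. The data is synthetic and the names X_demo, y_demo, demo_pipe and demo_gs are illustrative.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(42)

# Synthetic stand-in for the ordered weekly features and target
X_demo = rng.normal(size=(80, 2))
y_demo = X_demo[:, 0] ** 2 - 0.5 * X_demo[:, 1] + rng.normal(0, 0.1, 80)

demo_pipe = Pipeline([
    ("scale", StandardScaler()),
    ("poly", PolynomialFeatures(include_bias=False)),
    ("ridge", Ridge()),
])

# Each split trains on an initial segment and validates on the rows after it
tscv = TimeSeriesSplit(n_splits=5)
splits = list(tscv.split(X_demo))
for train_idx, val_idx in splits:
    print(f"train rows 0..{train_idx.max()}, validate rows {val_idx.min()}..{val_idx.max()}")

demo_gs = GridSearchCV(
    demo_pipe,
    {"poly__degree": [1, 2, 3], "ridge__alpha": np.logspace(-3, 3, 7)},
    cv=tscv,
    scoring="neg_root_mean_squared_error",
)
demo_gs.fit(X_demo, y_demo)
print("Best params:", demo_gs.best_params_)
```

In the tutorial's own search, the equivalent change would be passing a TimeSeriesSplit instance as the cv argument in place of the integer 5.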

6. Evaluate Model

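import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

y_pred = gs.predict(X_test)
# RMSE as the square root of MSE; the squared= argument is deprecated in newer scikit-learn
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
r2   = r2_score(y_test, y_pred)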

print(f"Test RMSE: {rmse:.2f}% growth")
print(f"Test R²  : {r2:.3f}")
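For readers who want to see what the two metrics measure, they can be reproduced by hand. The arrays below are invented for illustration and are not model output.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical actual and predicted growth values, in percent
y_true_demo = np.array([1.0, -2.0, 3.0, 0.0])
y_pred_demo = np.array([0.5, -1.0, 2.0, 0.5])

residuals = y_true_demo - y_pred_demo

# RMSE: square root of the mean squared residual
rmse_manual = np.sqrt(np.mean(residuals ** 2))

# R²: 1 minus residual sum of squares over total sum of squares
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y_true_demo - y_true_demo.mean()) ** 2)
r2_manual = 1 - ss_res / ss_tot

rmse_sklearn = np.sqrt(mean_squared_error(y_true_demo, y_pred_demo))
r2_sklearn = r2_score(y_true_demo, y_pred_demo)

print(f"RMSE manual={rmse_manual:.4f}  sklearn={rmse_sklearn:.4f}")
print(f"R²   manual={r2_manual:.4f}  sklearn={r2_sklearn:.4f}")
```

RMSE is in the same units as the target (percentage points of growth here), while R² compares the model against always predicting the mean of the test targets.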

7. Inspect Key Polynomial Coefficients

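Inspecting the coefficients surfaces the most influential terms, which can guide pricing actions such as moderating cost during peak-volume weeks or adjusting premium-tool surcharges. Because the inputs are standardised before expansion, coefficient magnitudes are roughly comparable across terms. The plot shows absolute values, so the direction of each effect must be read from the signed coefficients.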

poly       = gs.best_estimator_.named_steps['poly']
feat_names = poly.get_feature_names_out(input_features=feature_cols)
coefs      = gs.best_estimator_.named_steps['ridge'].coef_

import pandas as pd
import matplotlib.pyplot as plt

coef_series = pd.Series(coefs, index=feat_names).abs().sort_values(ascending=False)
plt.figure(figsize=(8,5))
coef_series.head(10).plot(kind='barh')
plt.gca().invert_yaxis()
plt.title("Top Polynomial Features Driving Cost Growth")
plt.xlabel("Coefficient Magnitude")
plt.tight_layout()
plt.show()

Summary

This Polynomial Regression pipeline with Ridge regularisation provides:

1. Nonlinear forecasts of weekly tool-rental cost growth that capture diminishing-return and interaction effects, with accuracy reported as test RMSE and R².

2. Balanced complexity, avoiding overfitting through α tuning.

3. Interpretable insights, with clear identification of the polynomial features—like cost_prev² and cost_prev × volume_prev—that drive cost trends, enabling data‑driven dynamic pricing decisions.
