Customer Retention Cost Prediction with Lasso Regression in ML

Subscription businesses spend heavily on offers—such as cashbacks, plan discounts, and loyalty points—to prevent customers from churning. Yet, most teams cannot quantify exactly how much a new retention campaign should cost for a specific account. This project builds a Lasso‑regularised linear model that:

  • Predicts the minimum incentive cost (USD) likely required to persuade a subscriber to stay, using their service usage, tenure, and payment behaviour.
  • Identifies the small set of customer traits that truly drive retention spending, because Lasso’s ℓ1 penalty shrinks uninformative coefficients to zero.
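
For reference, the objective that scikit-learn's Lasso minimises can be written as below (intercept omitted for brevity). Here n is the number of training rows, X the preprocessed feature matrix, w the coefficient vector, and α the penalty strength tuned later in the walkthrough. The absolute-value penalty is what lets coefficients reach exactly zero, which a squared (ridge) penalty does not do.

```latex
\min_{w}\; \frac{1}{2n}\,\lVert y - Xw \rVert_2^2 \;+\; \alpha\,\lVert w \rVert_1
```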

The target variable (retention_cost) is engineered from monthly charges and tenure, yielding a dollar estimate that marketing can act upon.

Libraries Required

Purpose          Library
Data handling    pandas, numpy
Visualisation    matplotlib, seaborn
ML pipeline      scikit‑learn: Lasso, Pipeline, ColumnTransformer, StandardScaler, OneHotEncoder, GridSearchCV
Evaluation       scikit‑learn metrics: mean_squared_error, r2_score

Dataset Link

Telco Customer Churn (Kaggle dataset: blastchar/telco-customer-churn)

Step-by-Step Code Implementation

1. Import Libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error, r2_score

2. Download and Load the Dataset

The dataset combines demographic, contract, usage, and billing data for 7,043 telecom subscribers.

# One‑time download (requires Kaggle API):
# kaggle datasets download -d blastchar/telco-customer-churn -p data --unzip

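# Adjust the path if your copy keeps the Kaggle file name (WA_Fn-UseC_-Telco-Customer-Churn.csv).
data = pd.read_csv("data/Telco-Customer-Churn.csv")   # 7 043 rows, 21 columns

# TotalCharges loads as text because a few rows are blank; make it numeric (blanks -> 0, an assumption).
data['TotalCharges'] = pd.to_numeric(data['TotalCharges'], errors='coerce').fillna(0)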

3. Create a “retention cost” target

Assumption: Retaining an at-risk customer usually requires ~20% of their monthly bill for each remaining month of a standard 5-year lifetime (60 months).

AVG_LIFETIME = 60                     # months
INCENTIVE_RATE = 0.20                 # 20 % of monthly charges

data['remaining_months'] = AVG_LIFETIME - data['tenure']
data['remaining_months'] = data['remaining_months'].clip(lower=0)

data['retention_cost'] = data['MonthlyCharges'] * INCENTIVE_RATE * data['remaining_months']
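
To make the arithmetic concrete, here is a minimal sketch of the same formula applied to two made-up customers. The helper name retention_cost and the example values are illustrative assumptions, not part of the dataset.

```python
def retention_cost(monthly_charges, tenure, avg_lifetime=60, incentive_rate=0.20):
    """Same formula as the column-wise version, for one customer (hypothetical helper)."""
    remaining_months = max(avg_lifetime - tenure, 0)   # never negative, like .clip(lower=0)
    return monthly_charges * incentive_rate * remaining_months

# 12 months in: 48 months remain, so the cost is 70 * 0.20 * 48
new_customer = retention_cost(monthly_charges=70.0, tenure=12)

# 72 months in: already past the assumed 60-month lifetime, so nothing remains
loyal_customer = retention_cost(monthly_charges=70.0, tenure=72)

print(new_customer, loyal_customer)
```

Under this assumption, long-tenured customers always receive a cost of zero, so changing AVG_LIFETIME shifts how many rows fall into that zero group.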

4. Define features and target

The target expresses retention effort as a dollar amount: 20% of the monthly bill for every month remaining in an assumed five-year relationship. Both constants can be adjusted for other industries. Every column except the target and the customer identifier is used as a feature.

y = data['retention_cost']
X = data.drop(columns=['retention_cost', 'customerID'])   # drop identifier

5. Pre‑processing recipe

One‑hot encoding converts categorical variables (e.g., Contract, PaymentMethod) to numeric dummies; numeric columns (e.g., MonthlyCharges, tenure) are z‑scaled so Lasso’s penalty treats them equally.

cat_cols = X.select_dtypes('object').columns
num_cols = X.select_dtypes(exclude='object').columns

preprocess = ColumnTransformer([
    # sparse_output replaces the old `sparse` flag (scikit-learn >= 1.2)
    ('cat', OneHotEncoder(drop='first', sparse_output=False), cat_cols),
    ('num', StandardScaler(), num_cols)
])

6. Train/test split

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=data['Churn'])

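7. Build & tune Lasso pipeline

A log‑spaced α grid (0.001 to 10) lets the search trade sparsity against fit: larger α zeroes more coefficients, while smaller α follows the training data more closely. Five‑fold cross‑validation reduces the variance of the score estimate on this modest dataset.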

pipe = Pipeline([
    ('prep', preprocess),
    ('model', Lasso(max_iter=10_000, random_state=42))
])

param_grid = {'model__alpha': np.logspace(-3, 1, 30)}   # 0.001 → 10
search = GridSearchCV(pipe, param_grid, cv=5,
                      scoring='neg_root_mean_squared_error', n_jobs=-1)
search.fit(X_train, y_train)

print("Optimal α:", search.best_params_['model__alpha'])
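
To see how α controls sparsity, the following sketch fits Lasso at a few values from the same log-spaced range and counts the surviving coefficients. It uses synthetic data, not the Telco features, in which only the first two of ten columns carry signal. All names (X_demo, y_demo, nonzero_counts) are illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(300, 10))     # 10 synthetic features, roughly unit variance
# Only features 0 and 1 influence the target; the other eight are pure noise
y_demo = 4 * X_demo[:, 0] - 2 * X_demo[:, 1] + rng.normal(scale=0.5, size=300)

alphas = np.logspace(-3, 1, 5)          # 0.001, 0.01, 0.1, 1, 10
nonzero_counts = []
for alpha in alphas:
    model = Lasso(alpha=alpha, max_iter=10_000).fit(X_demo, y_demo)
    nonzero_counts.append(int(np.count_nonzero(model.coef_)))

for alpha, count in zip(alphas, nonzero_counts):
    print(f"alpha={alpha:g}: {count} non-zero coefficients")
```

At the largest α every coefficient is shrunk to zero, which is why the grid search needs a range wide enough to cover both extremes.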

8. Evaluate on hold‑out set

RMSE expresses the typical incentive‑cost error in dollars, while R² shows the share of variance explained.

y_pred = search.predict(X_test)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))   # squared=False was removed in scikit-learn 1.6
r2   = r2_score(y_test, y_pred)
print(f"Test RMSE: ${rmse:,.0f} | R²: {r2:.3f}")
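
For readers who want to check what the two metrics measure, here is a small sketch that computes them by hand on three made-up values and compares the result with scikit-learn. The numbers are invented purely for illustration.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

y_true = np.array([100.0, 200.0, 300.0])   # made-up retention costs in USD
y_hat = np.array([110.0, 190.0, 330.0])    # made-up predictions

# RMSE: square root of the mean squared error, in the same unit as the target
rmse_manual = np.sqrt(np.mean((y_true - y_hat) ** 2))

# R^2: 1 minus (residual sum of squares / total sum of squares around the mean)
r2_manual = 1 - np.sum((y_true - y_hat) ** 2) / np.sum((y_true - y_true.mean()) ** 2)

rmse_sklearn = np.sqrt(mean_squared_error(y_true, y_hat))
r2_sklearn = r2_score(y_true, y_hat)

print(round(rmse_manual, 2), round(r2_manual, 3))
```

Because RMSE squares each error before averaging, the single 30-dollar miss dominates the two 10-dollar misses.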

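9. Interpret coefficients

Non-zero coefficients point to the cost drivers, for example month-to-month contracts (often high cost) versus two-year contracts (low cost). Because numeric features were standardised, their coefficients are in USD per standard deviation, while one-hot coefficients are in USD relative to the dropped baseline category. Features whose coefficients are zeroed are candidates to drop from future data collection, which saves ETL effort.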

# Retrieve one‑hot column names
ohe = search.best_estimator_.named_steps['prep'].named_transformers_['cat']
ohe_names = ohe.get_feature_names_out(cat_cols)
feature_names = np.hstack([ohe_names, num_cols])

coefs = search.best_estimator_.named_steps['model'].coef_
importance = (pd.Series(coefs, index=feature_names)
                .sort_values(key=abs, ascending=False))

plt.figure(figsize=(9,6))
importance.head(20).plot(kind='barh')
plt.title('Top Drivers of Retention Cost (Lasso Coefficients)')
plt.gca().invert_yaxis()
plt.xlabel('Coefficient (USD change)')
plt.show()
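
Coefficients of z‑scaled numeric features are expressed per standard deviation, so reading them as dollars per original unit requires dividing by the scaler's scale. The sketch below shows this on a synthetic one-feature example where the true slope is 5 USD per month. The names tenure_demo, cost_demo and demo_pipe are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

tenure_demo = np.arange(60, dtype=float).reshape(-1, 1)   # synthetic tenure: 0..59 months
cost_demo = 5.0 * tenure_demo.ravel() + 100.0             # synthetic target: 5 USD per month

demo_pipe = Pipeline([
    ("scale", StandardScaler()),
    ("model", Lasso(alpha=0.01)),
]).fit(tenure_demo, cost_demo)

# Coefficient as fitted: USD change per one standard deviation of tenure
coef_per_sd = demo_pipe.named_steps["model"].coef_[0]

# Divide by the scaler's standard deviation to get USD change per month
coef_per_unit = coef_per_sd / demo_pipe.named_steps["scale"].scale_[0]

print(f"{coef_per_sd:.2f} USD per SD, {coef_per_unit:.3f} USD per month")
```

In the tuned pipeline, the corresponding scaler should be reachable as search.best_estimator_.named_steps['prep'].named_transformers_['num'], whose scale_ array follows the order of num_cols.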

Summary

This Lasso-based pipeline converts raw churn data into a per-customer dollar estimate of retention spending and a ranked list of cost drivers. Marketing teams can rerun the notebook quarterly with fresh records, tweak the incentive formula, and immediately see which customer segments require the highest budget, which supports smarter, data-backed retention campaigns.

ProjectGurukul Team
