Customer Acquisition Cost Prediction with Lasso Regression in ML
Growth teams spend aggressively on digital ads but often learn the actual customer‑acquisition cost only after the campaign ends. Our goal is to build a Lasso-regularised linear model that can predict CAC (USD per converting customer) before launching a Facebook ad placement, using features known at the planning stage (audience age band, gender, interest category, impressions, clicks, etc.).
By shrinking weak predictors to zero, Lasso naturally highlights the variables that matter most—allowing marketers to reallocate budget toward high‑efficiency segments.
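For reference, the objective that scikit-learn's Lasso minimises can be written as below. Here n is the number of training rows and α is the penalty strength tuned later in this article; the symbols β₀ (intercept), β (coefficient vector) and x_i (feature vector of row i) are notation introduced for this sketch, not taken from the original text.

```latex
\min_{\beta_0,\,\beta}\;
\frac{1}{2n}\sum_{i=1}^{n}\bigl(y_i - \beta_0 - x_i^{\top}\beta\bigr)^2
\;+\; \alpha \sum_{j=1}^{p} \lvert \beta_j \rvert
```

The absolute-value penalty is what allows individual coefficients to land on exactly zero, whereas a squared (ridge) penalty only shrinks them toward zero.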
Libraries Required
| Purpose | Python Package |
|---|---|
| Data loading & wrangling | pandas, numpy |
| Visualisation | matplotlib, seaborn |
| Machine‑learning pipeline | scikit‑learn — ColumnTransformer, OneHotEncoder, StandardScaler, Pipeline, Lasso, GridSearchCV |
| Evaluation metrics | mean_squared_error, r2_score |
Dataset Link
Step-by-Step Code Implementation
1. Import Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error, r2_score
2. Download and load the dataset
Dataset — logs of Facebook ads: impressions, clicks, spend, audience attributes, and Approved_Conversion counts.
# one‑time shell command (requires Kaggle API & key):
# kaggle datasets download -d madislemsalu/facebook-ad-campaign -p data --unzip
ads = pd.read_csv("data/facebook_ad.csv") # file name may differ; inspect the zip
3. Target engineering — CAC
Spend ÷ approved conversions yields an intuitive dollar figure all marketers recognise. Rows with zero conversions are discarded to avoid infinite values.
# Keep rows with at least one conversion to avoid division by zero
# (.copy() avoids pandas' SettingWithCopyWarning on the assignment below)
ads = ads[ads['Approved_Conversion'] > 0].copy()
# Customer Acquisition Cost (USD per conversion)
ads['CAC'] = ads['Spent'] / ads['Approved_Conversion']
y = ads['CAC']
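As a quick sanity check of the target definition, the sketch below applies the same filter and division to a made-up four-row frame. The values are invented for illustration; only the column names mirror the ones used in this article.

```python
import pandas as pd

# Toy rows with invented values; column names follow the article's code.
toy_ads = pd.DataFrame({
    "ad_id": [1, 2, 3, 4],
    "Spent": [10.0, 0.0, 45.0, 12.5],
    "Approved_Conversion": [2, 0, 3, 0],
})

# Same rule as above: keep only ads with at least one approved conversion.
converted = toy_ads[toy_ads["Approved_Conversion"] > 0].copy()
converted["CAC"] = converted["Spent"] / converted["Approved_Conversion"]

# Track how much data the filter removes.
dropped_share = 1 - len(converted) / len(toy_ads)

print(converted[["ad_id", "CAC"]])
print(f"Share of rows dropped: {dropped_share:.0%}")
```

Note that ad 4 spent money without converting and is dropped along with ad 2. On real logs it may be worth checking how many such rows are removed, since a model trained only on converting ads will tend to understate the cost of poorly performing placements.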
4. Feature matrix (X)
# Drop leakage columns and identifiers
X = ads.drop(columns=['CAC', 'Approved_Conversion', 'ad_id'])
# Quick peek
print(X.head())
5. Pre‑processing recipe
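One-hot encoding converts categorical predictors (age band, gender, interest) into dummy variables; numeric columns are z-scaled so that Lasso's penalty applies to every feature on a comparable scale. Note that select_dtypes('object') only picks up string columns, so an integer-coded category such as interest must be cast to string beforehand if it is to be one-hot encoded rather than scaled.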
cat_cols = X.select_dtypes('object').columns # Age, Gender, etc.
num_cols = X.select_dtypes(exclude='object').columns # Impressions, Clicks, Spent …
preprocess = ColumnTransformer([
    # sparse_output replaces the old `sparse` argument (scikit-learn >= 1.2)
    ('cat', OneHotEncoder(drop='first', sparse_output=False), cat_cols),
    ('num', StandardScaler(), num_cols)
])
6. Train/test split
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.2, random_state=42, stratify=ads['campaign_id'])
7. Build & tune Lasso pipeline
Wrapping preprocessing and modelling in a single pipeline prevents data leakage. A log‑spaced α search (0.001–10) balances sparsity against fitted error; five‑fold CV stabilises the hyper‑parameter choice.
pipe = Pipeline([
    ('prep', preprocess),
    ('model', Lasso(max_iter=10_000, random_state=42))
])
param_grid = {'model__alpha': np.logspace(-3, 1, 30)} # 0.001 → 10
search = GridSearchCV(pipe,
                      param_grid,
                      cv=5,
                      scoring='neg_root_mean_squared_error',
                      n_jobs=-1)
search.fit(X_train, y_train)
print("Optimal α:", search.best_params_['model__alpha'])
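To see the sparsity side of that trade-off in isolation, the sketch below counts non-zero coefficients at a few values of α over the same 0.001 to 10 range. It uses synthetic data from make_regression rather than the ad logs, so the numbers say nothing about the real dataset; it only illustrates the mechanism.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

# Synthetic problem: 20 features, of which only 5 carry signal.
X_syn, y_syn = make_regression(
    n_samples=300, n_features=20, n_informative=5,
    noise=10.0, random_state=0,
)
X_syn = StandardScaler().fit_transform(X_syn)

alphas = np.logspace(-3, 1, 5)  # 0.001, 0.01, 0.1, 1, 10
nonzero_counts = []
for alpha in alphas:
    lasso = Lasso(alpha=alpha, max_iter=10_000).fit(X_syn, y_syn)
    # Count coefficients the penalty has not shrunk to exactly zero.
    n_nonzero = int(np.sum(lasso.coef_ != 0))
    nonzero_counts.append(n_nonzero)
    print(f"alpha={alpha:<6g} non-zero coefficients: {n_nonzero}")
```

Small α keeps nearly every feature in the model, while large α retains only the strongest ones. Cross-validation picks the point on this path with the best held-out error.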
8. Evaluate on the hold‑out set
RMSE reports the typical size of the CAC prediction error (in dollars), while R² shows the share of variance explained.
y_pred = search.predict(X_test)
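# `squared=False` was removed in recent scikit-learn releases; taking the root manually works on all versions
rmse = np.sqrt(mean_squared_error(y_test, y_pred))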
r2 = r2_score(y_test, y_pred)
print(f"Test RMSE: ${rmse:,.2f} per customer | R²: {r2:.3f}")
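An RMSE figure is easier to judge next to a naive reference. One common option, shown here as a sketch and not part of the original workflow, is to compare against a baseline that always predicts the training-set mean. All numbers below are invented, and model_pred stands in for the output of search.predict.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.metrics import mean_squared_error

# Invented CAC values (USD); model_pred is a hypothetical model output.
y_tr = np.array([4.0, 6.0, 5.0, 9.0])
y_te = np.array([5.0, 7.0])
model_pred = np.array([5.5, 6.5])

# DummyRegressor ignores the features, so a placeholder column is enough.
baseline = DummyRegressor(strategy="mean").fit(np.zeros((len(y_tr), 1)), y_tr)
baseline_pred = baseline.predict(np.zeros((len(y_te), 1)))

def rmse_of(y_true, y_hat):
    """Root mean squared error, computed without version-specific arguments."""
    return float(np.sqrt(mean_squared_error(y_true, y_hat)))

baseline_rmse = rmse_of(y_te, baseline_pred)
model_rmse = rmse_of(y_te, model_pred)

print(f"Baseline RMSE: ${baseline_rmse:.2f} | Model RMSE: ${model_rmse:.2f}")
```

If the tuned Lasso does not clearly beat the mean baseline on the hold-out set, the planning-stage features probably carry little information about CAC.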
9. Interpret feature importance
Non-zero Lasso coefficients indicate the levers most strongly associated with acquisition cost, perhaps “female 25-34 in interest cluster 7” or high impression volume with low CTR. Zeroed coefficients mark features that add little predictive value, simplifying future data collection.
# Retrieve feature names after one‑hot encoding
ohe = search.best_estimator_.named_steps['prep'].named_transformers_['cat']
ohe_names = ohe.get_feature_names_out(cat_cols)
feature_names = np.hstack([ohe_names, num_cols])
coef = search.best_estimator_.named_steps['model'].coef_
importance = (pd.Series(coef, index=feature_names)
.sort_values(key=abs, ascending=False))
plt.figure(figsize=(9,6))
importance.head(20).plot(kind='barh')
plt.gca().invert_yaxis()
plt.title('Top Drivers of CAC (Lasso Coefficients)')
plt.xlabel('Coefficient (Δ USD per customer)')
plt.show()
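One point to keep in mind when reading the chart: because numeric columns pass through StandardScaler, their coefficients are expressed per one standard deviation of the feature, not per raw unit. The sketch below shows the conversion on a single synthetic feature, under the assumption that CAC rises by 0.0001 USD per impression. That relationship and all variable names here are invented for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Synthetic data: CAC rises by 0.0001 USD per impression, plus small noise.
impressions = rng.uniform(1_000, 50_000, size=200)
cac = 2.0 + 0.0001 * impressions + rng.normal(scale=0.05, size=200)

scaler = StandardScaler()
X_std = scaler.fit_transform(impressions.reshape(-1, 1))
fitted = Lasso(alpha=0.01).fit(X_std, cac)

# Coefficient as the model reports it: USD per one standard deviation.
coef_per_sd = fitted.coef_[0]
# Dividing by the scaler's standard deviation gives USD per raw unit.
coef_per_unit = coef_per_sd / scaler.scale_[0]

print(f"Per SD: {coef_per_sd:.3f} USD | Per impression: {coef_per_unit:.7f} USD")
```

The per-unit value comes out slightly below the true slope because the Lasso penalty shrinks every retained coefficient toward zero.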
Summary
In roughly 50 lines of Python, we created an interpretable, cross-validated pipeline that:
- Forecasts CAC before spending a cent, letting growth teams budget with confidence.
- Ranks cost drivers, providing a clear roadmap for lowering acquisition expenses.
- Automates updates—because the entire workflow sits inside a Pipeline, weekly retraining with fresh campaign logs is a straightforward fit() call.
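The retraining and forecasting loop described in these points could look roughly like the sketch below. The helper names build_pipeline and retrain, the fixed α, and the two tiny weekly frames are all hypothetical; in practice the pipeline and α would come from the grid search above.

```python
import pandas as pd
from sklearn.base import clone
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import Lasso
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

def build_pipeline():
    """Hypothetical stand-in for the tuned pipeline (alpha fixed for brevity)."""
    prep = ColumnTransformer([
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["gender"]),
        ("num", StandardScaler(), ["Impressions"]),
    ])
    return Pipeline([("prep", prep), ("model", Lasso(alpha=0.01, max_iter=10_000))])

def retrain(template, logs):
    """Fit a fresh copy of the pipeline on all logs collected so far."""
    fresh = clone(template)
    fresh.fit(logs.drop(columns="CAC"), logs["CAC"])
    return fresh

# Invented weekly logs where CAC = 4 + Impressions / 1000.
week1 = pd.DataFrame({"gender": ["M", "F", "M", "F"],
                      "Impressions": [1000, 2000, 3000, 4000],
                      "CAC": [5.0, 6.0, 7.0, 8.0]})
week2 = pd.DataFrame({"gender": ["F", "M", "F", "M"],
                      "Impressions": [1500, 2500, 3500, 4500],
                      "CAC": [5.5, 6.5, 7.5, 8.5]})

model_v1 = retrain(build_pipeline(), week1)
model_v2 = retrain(build_pipeline(), pd.concat([week1, week2], ignore_index=True))

# Forecast CAC for a planned placement using planning-stage inputs only.
planned = pd.DataFrame({"gender": ["F"], "Impressions": [2500]})
forecast = float(model_v2.predict(planned)[0])
print(f"Forecast CAC: ${forecast:.2f}")
```

Cloning before each fit keeps earlier model versions untouched, which makes it easy to compare forecasts from week to week.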
Deploying this Lasso‑based tool moves marketing decisions from hindsight to foresight, unlocking smarter, data‑driven campaign planning.