Marketing Spend Efficiency Prediction with Lasso Regression in ML
CMOs operate under constant pressure to extract more revenue from every marketing dollar. While dashboards show how much is being spent per channel, they seldom forecast how efficiently that money will convert to sales before the campaign goes live. This project builds a Lasso‑regularised linear model that:
- Predicts the efficiency of a proposed spend mix, expressed as sales generated per $1 000 invested.
- Highlights which channels (or channel combinations) truly drive efficiency by shrinking unimportant predictors to zero, so planners can cut waste without guesswork.
Because Lasso couples an ℓ1 penalty with ordinary least squares, it delivers both interpretability and built‑in feature selection—ideal for marketing teams who need clear, actionable insights.
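To make the feature-selection behaviour concrete, here is a minimal sketch on synthetic data (not the advertising dataset). The variable names and the penalty strength `alpha=0.3` are illustrative assumptions. scikit-learn's `Lasso` minimises `(1 / (2 * n_samples)) * ||y - Xw||² + alpha * ||w||₁`, so a large enough `alpha` pushes the coefficients of uninformative columns to exactly zero, whereas ordinary least squares leaves them small but non-zero.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

# Synthetic data: five predictors, but only the first two drive the target.
rng = np.random.default_rng(0)
n = 200
X_demo = rng.normal(size=(n, 5))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + rng.normal(scale=0.5, size=n)

ols = LinearRegression().fit(X_demo, y_demo)
lasso = Lasso(alpha=0.3).fit(X_demo, y_demo)  # alpha chosen for illustration only

print("OLS coefficients:  ", np.round(ols.coef_, 3))
print("Lasso coefficients:", np.round(lasso.coef_, 3))
print("Non-zero Lasso coefficients:", np.count_nonzero(lasso.coef_))
```

The surviving Lasso coefficients are also shrunk towards zero relative to OLS, which is the price paid for the sparsity.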
Libraries Required
| Purpose | Library |
| --- | --- |
| Data handling | pandas, numpy |
| Visualisation | matplotlib, seaborn |
| ML pipeline | scikit‑learn → Lasso, Pipeline, ColumnTransformer, StandardScaler, train_test_split, GridSearchCV |
| Metrics | scikit‑learn → mean_squared_error, r2_score |
Dataset Link
Step-by-Step Code Implementation
1. Import Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error, r2_score
2. Download and load the dataset
The classic “Advertising” dataset records spend across TV, Radio, Newspaper and the resulting product sales for 200 campaigns.
# One‑time shell command (requires Kaggle API & key):
# kaggle datasets download -d yasserh/advertising-sales-dataset -p data --unzip
raw = pd.read_csv("data/Advertising.csv") # file name inside zip
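Column names can differ between copies of this dataset, so it may be worth validating the frame before going further. The sketch below is an assumption-laden helper, not part of the original walkthrough: `check_advertising_frame` and `EXPECTED_COLUMNS` are hypothetical names, and the expected columns are simply the ones the later steps rely on. It is demonstrated on a tiny made-up frame rather than the real file.

```python
import pandas as pd

# Column names the later steps rely on (assumed to match the downloaded file).
EXPECTED_COLUMNS = ["TV", "Radio", "Newspaper", "Sales"]

def check_advertising_frame(df: pd.DataFrame) -> pd.DataFrame:
    """Return only the expected columns; raise if any are missing or spend is zero."""
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"Missing columns {missing}; found {list(df.columns)}")
    out = df[EXPECTED_COLUMNS].copy()
    # A zero total spend would make the efficiency ratio in step 3 undefined.
    if (out[["TV", "Radio", "Newspaper"]].sum(axis=1) <= 0).any():
        raise ValueError("At least one row has zero total spend")
    return out

# Made-up rows, including a stray index column that gets dropped.
demo = pd.DataFrame({
    "Unnamed: 0": [1, 2],
    "TV": [100.0, 50.0],
    "Radio": [20.0, 10.0],
    "Newspaper": [5.0, 0.0],
    "Sales": [15.0, 8.0],
})
clean = check_advertising_frame(demo)
print(clean.columns.tolist())
```

With the real data, the call would be `raw = check_advertising_frame(raw)` directly after `pd.read_csv`.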
3. Engineer the “efficiency” target
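Dividing sales by total spend and multiplying by 1 000 gives an efficiency ratio: sales per $1 000 of spend. This mirrors how finance teams evaluate ROI. The ratio is only as meaningful as the units in the file, so confirm how the Sales and spend columns are denominated before interpreting it.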
raw['total_spend'] = raw[['TV', 'Radio', 'Newspaper']].sum(axis=1)
raw['efficiency'] = raw['Sales'] / raw['total_spend'] * 1000  # sales per $1 000
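As a quick sanity check of the arithmetic, the same two lines can be run on a two-row toy frame. The numbers here are invented for illustration and do not come from the dataset.

```python
import pandas as pd

# Two made-up campaigns.
toy = pd.DataFrame({
    "TV": [200.0, 50.0],
    "Radio": [40.0, 30.0],
    "Newspaper": [10.0, 20.0],
    "Sales": [20.0, 12.0],
})

toy["total_spend"] = toy[["TV", "Radio", "Newspaper"]].sum(axis=1)
toy["efficiency"] = toy["Sales"] / toy["total_spend"] * 1000

# Campaign 0: 20 / 250 * 1000 = 80
# Campaign 1: 12 / 100 * 1000 = 120
print(toy[["total_spend", "efficiency"]])
```

The second campaign spends less in total yet is more efficient, which is exactly the kind of difference the target is meant to expose.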
4. Feature matrix (X) and target (y)
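We include both absolute spends and each channel's share of the total budget. A linear model cannot learn interactions on its own, but combining spend levels with mix shares lets it approximate some of the non‑linear “diminishing‑returns” behaviour. These features are partly redundant (total_spend is the sum of the three channels, and the three shares sum to 1), so Lasso will typically keep only a subset of them.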
# Basic numeric features
num_features = ['TV', 'Radio', 'Newspaper', 'total_spend']

# Optional relative-mix features
raw['tv_pct'] = raw['TV'] / raw['total_spend']
raw['radio_pct'] = raw['Radio'] / raw['total_spend']
raw['news_pct'] = raw['Newspaper'] / raw['total_spend']
num_features += ['tv_pct', 'radio_pct', 'news_pct']

X = raw[num_features]
y = raw['efficiency']
5. Pre‑processing recipe
Only standardisation is required because every feature is numeric.
preprocess = ColumnTransformer(
    [('num', StandardScaler(), num_features)],
    remainder='drop'
)
6. Train/test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
7. Build & tune Lasso pipeline
Encapsulating scaling and modelling in a single pipeline prevents data leakage. A log‑spaced α grid (0.001–10) balances sparsity against fit; 5‑fold CV stabilises the hyper‑parameter choice.
pipe = Pipeline([
    ('prep', preprocess),
    ('model', Lasso(max_iter=10_000, random_state=42))
])

param_grid = {'model__alpha': np.logspace(-3, 1, 30)}  # 0.001 → 10
search = GridSearchCV(pipe, param_grid, cv=5,
                      scoring='neg_root_mean_squared_error')
search.fit(X_train, y_train)
print("Optimal α:", search.best_params_['model__alpha'])
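Beyond the single best α, it can help to look at how the cross-validated error changes across the whole grid. The following is a self-contained sketch on synthetic data; `X_syn`, `y_syn`, `demo_search` and `cv_table` are hypothetical names, and the data-generating coefficients are made up. It reads `cv_results_` from a fitted `GridSearchCV` and flips the sign of `mean_test_score`, because scikit-learn reports this scorer as a negative value.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the training data (assumed shapes and coefficients).
rng = np.random.default_rng(42)
X_syn = pd.DataFrame(rng.uniform(0, 300, size=(160, 3)),
                     columns=["TV", "Radio", "Newspaper"])
y_syn = 0.05 * X_syn["TV"] + 0.2 * X_syn["Radio"] + rng.normal(scale=1.0, size=160)

demo_pipe = Pipeline([("scale", StandardScaler()),
                      ("model", Lasso(max_iter=10_000))])
demo_search = GridSearchCV(demo_pipe,
                           {"model__alpha": np.logspace(-3, 1, 30)},
                           cv=5,
                           scoring="neg_root_mean_squared_error")
demo_search.fit(X_syn, y_syn)

# One row per alpha: mean and spread of the RMSE across the 5 folds.
cv_table = pd.DataFrame({
    "alpha": [p["model__alpha"] for p in demo_search.cv_results_["params"]],
    "cv_rmse": -demo_search.cv_results_["mean_test_score"],
    "cv_rmse_std": demo_search.cv_results_["std_test_score"],
})
best_row = cv_table.loc[cv_table["cv_rmse"].idxmin()]
print(cv_table.round(4).to_string(index=False))
print("Best alpha:", best_row["alpha"])
```

If the error curve is nearly flat around the minimum, a somewhat larger α gives a sparser model at little cost in accuracy. The same table can be built from the `search` object fitted above.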
8. Evaluate on the hold‑out set
RMSE expresses the typical prediction error in the same unit as the target (sales per $1 000). R² indicates explanatory power.
y_pred = search.predict(X_test)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))  # squared=False is no longer accepted in recent scikit-learn releases
r2 = r2_score(y_test, y_pred)
print(f"Test RMSE: {rmse:.2f} sales per $1 000 | R²: {r2:.3f}")
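An RMSE is easier to judge next to a naive baseline. The sketch below, on synthetic data with made-up coefficients, compares a Lasso model against scikit-learn's `DummyRegressor`, which always predicts the training mean. The variable names and `alpha=0.1` are illustrative assumptions.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data: two informative predictors, two noise predictors.
rng = np.random.default_rng(7)
X_b = rng.normal(size=(200, 4))
y_b = 50 + 8 * X_b[:, 0] - 5 * X_b[:, 1] + rng.normal(scale=2.0, size=200)

Xtr, Xte, ytr, yte = train_test_split(X_b, y_b, test_size=0.2, random_state=42)

baseline = DummyRegressor(strategy="mean").fit(Xtr, ytr)  # predicts the training mean
model = Lasso(alpha=0.1).fit(Xtr, ytr)

rmse_baseline = np.sqrt(mean_squared_error(yte, baseline.predict(Xte)))
rmse_model = np.sqrt(mean_squared_error(yte, model.predict(Xte)))
print(f"Baseline RMSE: {rmse_baseline:.2f} | Lasso RMSE: {rmse_model:.2f}")
```

On the real data, fitting `DummyRegressor` on `X_train, y_train` and scoring it on the same hold-out set gives the reference point; a model that barely beats it adds little planning value.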
9. Inspect feature importance
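Coefficients that remain non‑zero after Lasso’s penalty indicate which spending channels and mix proportions the model relies on to explain efficiency. Because the features were standardised, each coefficient is the change in efficiency for a one‑standard‑deviation increase in that feature. For example, a positive coefficient on radio_pct but zero on news_pct suggests that reallocating part of the print budget to radio could raise ROI. Treat such readings with care: when features are strongly correlated, as the spend and share columns are here, Lasso may keep one and drop another somewhat arbitrarily.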
coefs = search.best_estimator_.named_steps['model'].coef_
features = np.array(num_features)
importance = pd.Series(coefs, index=features).sort_values(key=abs, ascending=False)
plt.figure(figsize=(8,5))
importance.plot(kind='barh')
plt.gca().invert_yaxis()
plt.title('Top Drivers of Marketing Efficiency (Lasso Coefficients)')
plt.xlabel('Coefficient (Δ sales per $1 000 per 1 SD change in feature)')
plt.show()
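The plotted coefficients are on the standardised scale. To express them per original unit of a feature, divide each coefficient by the standard deviation that `StandardScaler` learned (its `scale_` attribute). The sketch below shows this on synthetic data; the names `pipe_u`, `per_sd` and `per_unit` and the true slopes 0.04 and 0.2 are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data with known per-unit slopes: 0.04 for TV, 0.2 for Radio.
rng = np.random.default_rng(1)
X_u = pd.DataFrame({"TV": rng.uniform(0, 300, 200),
                    "Radio": rng.uniform(0, 50, 200)})
y_u = 0.04 * X_u["TV"] + 0.2 * X_u["Radio"] + rng.normal(scale=0.5, size=200)

pipe_u = Pipeline([("scale", StandardScaler()),
                   ("model", Lasso(alpha=0.01))]).fit(X_u, y_u)

scaler = pipe_u.named_steps["scale"]
lasso_u = pipe_u.named_steps["model"]

per_sd = pd.Series(lasso_u.coef_, index=X_u.columns)  # change per 1 SD of the feature
per_unit = per_sd / scaler.scale_                     # change per 1 original unit

print(pd.DataFrame({"per_sd": per_sd, "per_unit": per_unit}).round(4))
```

In the pipeline built above, the fitted scaler sits inside the `ColumnTransformer` and can be reached with `search.best_estimator_.named_steps['prep'].named_transformers_['num']`.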
Summary
With fewer than 100 lines of Python code, we created a transparent, cross-validated pipeline that forecasts marketing efficiency before money is committed and ranks the spend levers with the largest impact. Budget planners can plug proposed channel mixes into the model, see the expected sales return per $1 000, and adjust allocations until forecast efficiency meets the firm’s hurdle rate, turning gut-feel budgeting into data-driven optimisation.
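Scoring a proposed mix means rebuilding the same seven features from the three planned spends. The sketch below is one way to do that, under stated assumptions: `build_features`, `planner` and the proposal values are hypothetical, and the model is fitted on synthetic history so the block runs on its own.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

FEATURES = ["TV", "Radio", "Newspaper", "total_spend",
            "tv_pct", "radio_pct", "news_pct"]

def build_features(spend: pd.DataFrame) -> pd.DataFrame:
    """Derive total spend and channel shares from raw channel spends."""
    out = spend[["TV", "Radio", "Newspaper"]].astype(float).copy()
    out["total_spend"] = out[["TV", "Radio", "Newspaper"]].sum(axis=1)
    if (out["total_spend"] <= 0).any():
        raise ValueError("Every plan needs a positive total spend")
    out["tv_pct"] = out["TV"] / out["total_spend"]
    out["radio_pct"] = out["Radio"] / out["total_spend"]
    out["news_pct"] = out["Newspaper"] / out["total_spend"]
    return out[FEATURES]

# Synthetic "history" standing in for the real campaigns (made-up relationship).
rng = np.random.default_rng(3)
hist = pd.DataFrame({"TV": rng.uniform(10, 300, 200),
                     "Radio": rng.uniform(10, 50, 200),
                     "Newspaper": rng.uniform(10, 100, 200)})
sales = 5 + 0.05 * hist["TV"] + 0.19 * hist["Radio"] + rng.normal(scale=1.0, size=200)

X_hist = build_features(hist)
y_hist = sales / X_hist["total_spend"] * 1000  # same target definition as step 3

planner = Pipeline([("scale", StandardScaler()),
                    ("model", Lasso(alpha=0.1, max_iter=10_000))]).fit(X_hist, y_hist)

# Two hypothetical budget proposals.
proposals = pd.DataFrame({"TV": [150.0, 100.0],
                          "Radio": [30.0, 60.0],
                          "Newspaper": [20.0, 40.0]},
                         index=["plan_a", "plan_b"])
forecast = pd.Series(planner.predict(build_features(proposals)),
                     index=proposals.index, name="forecast_efficiency")
print(forecast.round(1))
```

With the objects from this walkthrough, the equivalent call would be `search.predict(build_features(proposals)[num_features])`. Forecasts for mixes far outside the range of historical spends are extrapolations and should be treated cautiously.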