Marketing Channel Efficiency Prediction with ElasticNet Algorithm in ML


Growth teams spread budget across search, social, display, email, and referral channels, but only learn each channel’s efficiency, measured as return on ad spend (ROAS, the revenue earned per dollar spent), after the campaign closes. Because spend, impressions, clicks, CPM, and audience size move together, ordinary least squares (OLS) produces unstable coefficient estimates, while pure Lasso (ℓ¹) tends to keep one metric from a correlated group and discard the rest, even when they are genuinely helpful. An Elastic Net model combines the Ridge (ℓ²) and Lasso (ℓ¹) penalties to balance stability and sparsity, providing a transparent tool that forecasts ROAS for a proposed channel mix before money leaves the account.
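The instability claim can be checked on synthetic data. The sketch below is only an illustration under stated assumptions: the two features (a made-up "spend" and a near-copy called "impressions"), the noise levels, and the penalty settings are all invented for the demo and do not come from the ad dataset. It refits OLS and Elastic Net on 30 fresh samples and compares how much the coefficients move between fits.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, ElasticNet

def fit_once(model, seed):
    """Fit `model` on a fresh synthetic sample and return its coefficients."""
    r = np.random.default_rng(seed)
    spend = r.normal(size=200)
    # impressions is almost a copy of spend -> strong collinearity (assumed setup)
    impressions = spend + r.normal(scale=0.01, size=200)
    X = np.column_stack([spend, impressions])
    y = spend + impressions + r.normal(scale=0.5, size=200)
    return model.fit(X, y).coef_.copy()

seeds = range(30)
ols_coefs = np.array([fit_once(LinearRegression(), s) for s in seeds])
enet_coefs = np.array([fit_once(ElasticNet(alpha=0.1, l1_ratio=0.5), s) for s in seeds])

# Standard deviation of each coefficient across the 30 refits
ols_spread = ols_coefs.std(axis=0)
enet_spread = enet_coefs.std(axis=0)
print("OLS coefficient spread:        ", ols_spread.round(3))
print("Elastic Net coefficient spread:", enet_spread.round(3))
```

On data like this, the OLS coefficients swing widely from sample to sample, while the penalised coefficients stay close together, which is the behaviour the tutorial relies on.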

Libraries Required

Role	Library
Data wrangling	pandas, numpy
Visuals	matplotlib, seaborn
ML workflow	scikit‑learn → ColumnTransformer, OneHotEncoder, StandardScaler, ElasticNet, GridSearchCV, Pipeline, train_test_split
Metrics	mean_squared_error, r2_score

Dataset

Online Advertising Digital Marketing Data

Step-by-Step Code Implementation

1. Import libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.linear_model import ElasticNet
from sklearn.metrics import mean_squared_error, r2_score

2. Download & load dataset

Dataset: each row is a Facebook ad set with spend, impressions, clicks, country, creative demographics, and conversions.

# one‑time terminal (requires Kaggle API token):
# kaggle datasets download -d naniruddhan/online-advertising-digital-marketing-data -p data --unzip

ads = pd.read_csv("data/online_ads.csv")  # adjust file name if needed

3. Target engineering – ROAS

Target: ROAS measures revenue/spend, the universal channel‑efficiency KPI.

VALUE_PER_CONVERSION = 100        # business assumption in USD

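# keep rows with positive spend and at least one conversion;
# zero spend would make ROAS infinite
ads = ads[(ads['Spent'] > 0) & (ads['Approved_Conversion'] > 0)].copy()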
ads['Revenue'] = ads['Approved_Conversion'] * VALUE_PER_CONVERSION
ads['ROAS']    = ads['Revenue'] / ads['Spent']
y = ads['ROAS']
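To make the target arithmetic concrete, here is a small worked example on a made-up four-row table. The column names follow the tutorial, but the values are invented, and the 100 USD value per conversion is the same business assumption as above.

```python
import pandas as pd

VALUE_PER_CONVERSION = 100  # assumed value of one conversion in USD

# Hypothetical rows: one has zero spend, one has zero conversions
toy = pd.DataFrame({
    "Spent": [50.0, 0.0, 20.0, 10.0],
    "Approved_Conversion": [1, 2, 0, 3],
})

# Keep only rows where ROAS is well defined and non-zero
valid = toy[(toy["Spent"] > 0) & (toy["Approved_Conversion"] > 0)].copy()
valid["ROAS"] = valid["Approved_Conversion"] * VALUE_PER_CONVERSION / valid["Spent"]
print(valid)
```

Only the first and last rows survive the filter: 1 × 100 / 50 gives a ROAS of 2.0, and 3 × 100 / 10 gives 30.0. Without the spend filter, the second row would divide by zero.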

4. Feature matrix

Elastic Net logic: impressions, clicks, and spend are collinear; the Ridge part keeps them together, while the Lasso part trims rarely used audience dummies. Approved_Conversion is left out of the features because ROAS is computed directly from it, so including it would leak the target, and it is not known before launch.

X = ads[['Channel', 'Country', 'Age', 'Gender',
         'Impressions', 'Clicks', 'Spent', 'bidding_type']]

cat_cols = ['Channel', 'Country', 'Age', 'Gender', 'bidding_type']
num_cols = ['Impressions', 'Clicks', 'Spent']
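Before fitting, it can be worth confirming that the numeric features really are collinear. The sketch below uses synthetic data in place of the ad file: the column names mirror the tutorial, but the click-through rate, cost per click, and noise levels are assumptions made up for the demo. Correlations are computed on log values because ad metrics are usually heavily right-skewed.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 500

# Synthetic funnel: clicks scale with impressions, spend scales with clicks
impressions = rng.lognormal(mean=10, sigma=1.0, size=n)
clicks = impressions * 0.02 * rng.lognormal(mean=0, sigma=0.2, size=n)  # ~2% CTR (assumed)
spent = clicks * 1.5 * rng.lognormal(mean=0, sigma=0.2, size=n)         # ~1.5 USD per click (assumed)

demo = pd.DataFrame({"Impressions": impressions, "Clicks": clicks, "Spent": spent})

# Pearson correlation of the log-transformed metrics
corr = np.log(demo).corr()
print(corr.round(2))
```

On the real data the equivalent check would be ads[num_cols].corr(); pairwise correlations close to 1 are the situation in which the Ridge part of the penalty helps.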

5. Elastic Net pipeline

Pipeline: encoding & scaling sit inside a Pipeline, preventing leakage and making deployment a one‑liner.

preprocess = ColumnTransformer([
        # sparse_output replaced sparse in scikit-learn 1.2 (use sparse=False on older versions)
        ('cat', OneHotEncoder(drop='first', sparse_output=False), cat_cols),
        ('num', StandardScaler(), num_cols)
    ])

pipe = Pipeline([
        ('prep', preprocess),
        ('enet', ElasticNet(max_iter=15000, random_state=42))
    ])
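The leakage point can be illustrated with a toy example. In the sketch below, the six-row table, its values, and the alpha setting are all hypothetical. The pipeline is fitted on the first four rows only, and the scaler inside it ends up with the mean of those four rows rather than the mean of the full table, so the held-out rows never influence preprocessing.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import ElasticNet
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data; the last row is an outlier that lands in the test part
toy = pd.DataFrame({
    "Gender": ["M", "F", "M", "F", "M", "F"],
    "Spent": [10.0, 20.0, 30.0, 40.0, 50.0, 600.0],
})
roas = np.array([2.0, 1.8, 1.5, 1.4, 1.2, 0.3])

train, test = toy.iloc[:4], toy.iloc[4:]

prep = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["Gender"]),
    ("num", StandardScaler(), ["Spent"]),
])
toy_pipe = Pipeline([("prep", prep), ("enet", ElasticNet(alpha=0.01))])
toy_pipe.fit(train, roas[:4])

# The scaler only ever saw the training rows
scaler = toy_pipe.named_steps["prep"].named_transformers_["num"]
train_mean = scaler.mean_[0]
print("Scaler mean:", train_mean, "| full-table mean:", toy["Spent"].mean())

# Prediction on unseen rows applies the stored preprocessing, then the model
preds = toy_pipe.predict(test)
print(preds)
```

Scaling the whole table before splitting would instead bake the 600 USD outlier into the training features, which is the leak the Pipeline avoids.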

6. Train/test split & hyper‑parameter search

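Grid search: 162 candidate models (18 α × 9 mix ratios) are each scored with 5‑fold cross‑validation (810 fits in total), and the combination with the lowest RMSE is selected. Sparsity comes from the ℓ¹ part of the penalty, not from the selection criterion.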

X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=ads['Channel'])

param_grid = {
    'enet__alpha'   : np.logspace(-3, 1, 18),   # 0.001 → 10
    'enet__l1_ratio': np.linspace(0.1, 0.9, 9)  # Ridge‑heavy → Lasso‑heavy
}

gs = GridSearchCV(pipe, param_grid,
                  cv=5,
                  scoring='neg_root_mean_squared_error',
                  n_jobs=-1, verbose=1)
gs.fit(X_train, y_train)

print("Best α:", gs.best_params_['enet__alpha'])
print("Best l1_ratio:", gs.best_params_['enet__l1_ratio'])
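Beyond the single best pair, the full search results are often worth a look. The following sketch runs a smaller grid on synthetic data from make_regression (the data, grid size, and noise level are assumptions for the demo, not the tutorial's settings) and shows how to turn the negated scores back into RMSE and count how many coefficients the winning model zeroed out.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in: 20 features, only 5 of them informative
X_demo, y_demo = make_regression(n_samples=200, n_features=20, n_informative=5,
                                 noise=10.0, random_state=0)

grid = {"alpha": np.logspace(-3, 1, 5), "l1_ratio": [0.1, 0.5, 0.9]}
search = GridSearchCV(ElasticNet(max_iter=15000), grid, cv=5,
                      scoring="neg_root_mean_squared_error")
search.fit(X_demo, y_demo)

# scikit-learn maximises scores, so RMSE is stored negated; flip the sign back
results = pd.DataFrame(search.cv_results_)
results["cv_rmse"] = -results["mean_test_score"]
top = results.sort_values("cv_rmse")[["param_alpha", "param_l1_ratio", "cv_rmse"]].head(3)
print(top)

# How sparse is the selected model?
n_zero = int(np.sum(search.best_estimator_.coef_ == 0))
print("Coefficients set to zero:", n_zero, "of", X_demo.shape[1])
```

With the tutorial's pipeline, the same idea applies to gs.cv_results_, and the coefficients live in gs.best_estimator_.named_steps['enet'].coef_.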

7. Model evaluation

y_pred = gs.predict(X_test)
# np.sqrt works on every scikit-learn version (squared=False was deprecated in 1.4)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
r2   = r2_score(y_test, y_pred)

print(f"Hold‑out RMSE: {rmse:.3f} ROAS pts | R²: {r2:.3f}")
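An RMSE is easier to judge next to a naive baseline. As a hedged sketch, the code below compares an Elastic Net with a predict-the-mean baseline on synthetic data; the dataset and hyper-parameters are assumptions chosen for the demo, so the numbers say nothing about the ad data itself.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import ElasticNet
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X_demo, y_demo = make_regression(n_samples=300, n_features=10, noise=15.0, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)

def holdout_rmse(model):
    """Fit on the training split and return RMSE on the hold-out split."""
    model.fit(X_tr, y_tr)
    return float(np.sqrt(mean_squared_error(y_te, model.predict(X_te))))

baseline_rmse = holdout_rmse(DummyRegressor(strategy="mean"))  # always predicts the training mean
enet_rmse = holdout_rmse(ElasticNet(alpha=0.1, l1_ratio=0.5))

print(f"Baseline RMSE: {baseline_rmse:.2f} | Elastic Net RMSE: {enet_rmse:.2f}")
```

If the tuned model does not clearly beat the mean baseline on the real hold-out set, the features carry little signal about ROAS and the coefficient chart should be read with caution.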

8. Interpret top coefficients

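Interpretation: the coefficient chart ranks the features that move predicted ROAS the most. As a purely illustrative reading, a bar of +0.12 on a dummy such as Video Ads or the 25‑34 age group would mean that category adds 0.12 ROAS relative to the dropped baseline category, while a negative coefficient on Spent would indicate that high spend without proportional clicks drags efficiency down.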

# Recover feature names
ohe = gs.best_estimator_.named_steps['prep'].named_transformers_['cat']
ohe_names = ohe.get_feature_names_out(cat_cols)
feat_names = np.hstack([ohe_names, num_cols])

# Reverse‑scale numeric coefficients
scales = gs.best_estimator_.named_steps['prep'].named_transformers_['num'].scale_
coef   = gs.best_estimator_.named_steps['enet'].coef_.copy()  # copy so the fitted model is not modified
coef[-len(num_cols):] = coef[-len(num_cols):] / scales

imp = pd.Series(coef, index=feat_names).sort_values(key=abs, ascending=False).head(15)

plt.figure(figsize=(9,5))
imp.plot(kind='barh')
plt.gca().invert_yaxis()
plt.title('Elastic Net – Drivers of Channel ROAS')
plt.xlabel('Δ ROAS'); plt.tight_layout(); plt.show()
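The reverse-scaling step rests on a simple identity: a coefficient fitted on a standardised feature, divided by that feature's standard deviation, gives the effect per raw unit. The sketch below checks this on a single synthetic feature; the spend range, slope, and noise are made up for the demo.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
spend = rng.uniform(10, 1000, size=200)                         # raw dollars (synthetic)
roas = 5.0 - 0.002 * spend + rng.normal(scale=0.1, size=200)    # assumed relationship

scaler = StandardScaler()
Z = scaler.fit_transform(spend.reshape(-1, 1))
model = ElasticNet(alpha=0.001).fit(Z, roas)

# Work on a copy so the fitted model's coef_ stays untouched
coef_scaled = model.coef_.copy()
coef_raw = coef_scaled / scaler.scale_                          # ROAS change per extra dollar
intercept_raw = model.intercept_ - np.sum(coef_scaled * scaler.mean_ / scaler.scale_)

# Both parameterisations give the same predictions
pred_scaled = model.predict(Z)
pred_raw = intercept_raw + coef_raw[0] * spend
print("Per-dollar effect:", coef_raw[0])
print("Predictions match:", np.allclose(pred_scaled, pred_raw))
```

Note that after un-scaling, numeric coefficients are per raw unit (one dollar, one click), so their bars are not directly comparable in size with the dummy-variable bars.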

Summary

With fewer than 100 lines of Python, we built an Elastic Net “mixed” regression model that:

Forecasts pre‑launch ROAS so marketing can shift budget to the best‑performing channels.
Balances multicollinearity & sparsity – stable coefficients plus automatic dummy pruning.
Provides clear levers for optimisation (channel type, creative, audience).

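When next month’s ad plan arrives, call gs.predict(new_X) to forecast ROAS for the proposed ad sets. Once the actual results are in, rerun gs.fit on the updated data so the efficiency forecaster stays current and spend decisions stay data‑driven.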
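The difference between forecasting and refreshing can be sketched in a few lines. Everything here is synthetic and hypothetical: a plain ElasticNet stands in for the tuned pipeline, and the arrays stand in for past campaigns, a planned mix, and the outcomes that arrive later.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
true_w = np.array([1.0, 0.5, 0.0])  # assumed data-generating weights

# Past campaigns with known outcomes
X_old = rng.normal(size=(100, 3))
y_old = X_old @ true_w + rng.normal(scale=0.1, size=100)
model = ElasticNet(alpha=0.01).fit(X_old, y_old)

# Step 1: forecast a planned mix; no outcomes exist yet, so only predict()
X_plan = rng.normal(size=(5, 3))
forecast = model.predict(X_plan)
print("Forecast for planned rows:", forecast.round(2))

# Step 2: once outcomes are observed, refit on old + new data
y_plan_actual = X_plan @ true_w + rng.normal(scale=0.1, size=5)
X_all = np.vstack([X_old, X_plan])
y_all = np.concatenate([y_old, y_plan_actual])
model.fit(X_all, y_all)
```

Refitting on the combined history, rather than on the newest month alone, keeps the training set large enough for the cross-validated search to remain stable.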
