Student Grade Curve Prediction using Polynomial Regression in ML


Educational institutions often adjust raw exam scores to fit a grade curve, mapping individual performance onto a standard distribution. Here we build a model that predicts a student’s final grade (G3) from their first-period (G1) and second-period (G2) grades, along with background factors (study time, past failures, family support, and so on). The relationship between these inputs and the final grade may be nonlinear (for instance, improvements from G1 to G2 may show diminishing returns), in which case a plain linear regression would underfit.

Polynomial Regression helps us capture these curvatures and interactions to accurately forecast final grades. This allows educators to anticipate outcomes and tailor interventions.
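To make the underfitting argument concrete, here is a minimal sketch on synthetic data (not the student dataset). It assumes a made-up saturating response, a square-root curve plus noise, and compares the in-sample fit of a straight line with a degree-2 polynomial. The variable names and the data-generating formula are illustrative assumptions only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic, made-up data: a concave response with diminishing returns
rng = np.random.default_rng(0)
x = rng.uniform(0, 20, size=200).reshape(-1, 1)
y = 4.0 * np.sqrt(x.ravel()) + rng.normal(0, 0.5, size=200)

# Straight-line fit versus a fit that also sees x squared
linear = LinearRegression().fit(x, y)
quadratic = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    LinearRegression(),
).fit(x, y)

r2_linear = r2_score(y, linear.predict(x))
r2_quadratic = r2_score(y, quadratic.predict(x))
print(f"Linear R^2   : {r2_linear:.3f}")
print(f"Quadratic R^2: {r2_quadratic:.3f}")
```

Because the quadratic model contains the linear one as a special case, its in-sample R² cannot be lower. Whether the gain carries over to unseen data is what the cross-validation later in this tutorial is meant to check.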

Libraries Required

import pandas as pd                     # for data handling  
import numpy as np                      # for numerical operations  

from sklearn.model_selection import train_test_split, GridSearchCV  
from sklearn.preprocessing import StandardScaler, PolynomialFeatures, OneHotEncoder  
from sklearn.compose import ColumnTransformer  
from sklearn.linear_model import Ridge  
from sklearn.pipeline import Pipeline  
from sklearn.metrics import mean_squared_error, r2_score  

import matplotlib.pyplot as plt         # for plotting  
import seaborn as sns                   # for enhanced visualisation

Dataset

Student Alcohol Consumption (this tutorial uses the math course file, student-mat.csv)

Step-by-Step Code Implementation

1. Load Data & Libraries

import pandas as pd
df = pd.read_csv("data/student-mat.csv")   # math course file

# Display relevant columns
df[['G1','G2','G3','studytime','failures','schoolsup','famsup']].head()
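Some copies of this dataset may be semicolon-delimited and others comma-delimited. If the column selection above raises a KeyError, the delimiter is a likely cause. The sketch below uses a tiny made-up sample in place of the real file and lets pandas detect the separator by passing sep=None with the Python parsing engine.

```python
import io
import pandas as pd

# Hypothetical two-row sample standing in for a semicolon-delimited copy of the file
sample = "school;sex;G1;G2;G3\nGP;F;5;6;6\nGP;M;15;14;15\n"

# sep=None with engine="python" asks pandas to sniff the delimiter from the first row
df_sample = pd.read_csv(io.StringIO(sample), sep=None, engine="python")
print(df_sample.columns.tolist())
print(df_sample.dtypes)
```

For the real file, the same two arguments can be passed to the pd.read_csv call on "data/student-mat.csv".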

2. Exploratory Data Analysis

import seaborn as sns
import matplotlib.pyplot as plt

# Correlation heatmap for numeric predictors
sns.heatmap(df[['G1','G2','G3','studytime','failures']].corr(), annot=True, cmap='Blues')
plt.title("Numeric Feature Correlations")
plt.show()

3. Define Features & Target

# Select predictors and target
numeric_features = ['G1','G2','studytime','failures']
categorical_features = ['schoolsup','famsup','paid','higher','internet']
X = df[numeric_features + categorical_features]
y = df['G3']    # final grade

4. Build Pipeline with Polynomial Features

OneHotEncoder converts the categorical factors (schoolsup, famsup, paid, higher, internet) into binary flags; drop='first' removes one level per feature so the flags are not redundant with each other.
PolynomialFeatures is applied to the combined output of the preprocessor, so it augments both the scaled numeric inputs and the flags with their squares and pairwise products. This captures nonlinearities (e.g., a G2² term can model diminishing returns) and interactions between flags and scores. The square of a 0/1 flag equals the flag itself, so those terms add no new information.
StandardScaler puts the numeric inputs on a common scale so that the Ridge penalty treats them comparably.
Ridge regression (ℓ² penalty) stabilises coefficient estimates in this expanded feature space, which reduces the risk of overfitting.

from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, PolynomialFeatures, OneHotEncoder
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline

preprocessor = ColumnTransformer([
    ("num", StandardScaler(), numeric_features),
    ("cat", OneHotEncoder(drop='first'), categorical_features)
])

pipe = Pipeline([
    ("prep", preprocessor),
    ("poly", PolynomialFeatures(include_bias=False)),
    ("ridge", Ridge())
])

5. Train/Test Split & Hyperparameter Search

GridSearchCV tunes the polynomial degree (1–3) and Ridge α (10⁻³–10³) using 5‑fold CV to minimise RMSE on held‑out folds.

from sklearn.model_selection import train_test_split, GridSearchCV
import numpy as np

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

param_grid = {
    "poly__degree": [1, 2, 3],
    "ridge__alpha": np.logspace(-3, 3, 7)
}

gs = GridSearchCV(
    pipe, param_grid,
    cv=5, scoring="neg_root_mean_squared_error",
    n_jobs=-1, verbose=1
)
gs.fit(X_train, y_train)

print("Best parameters:", gs.best_params_)
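The degree grid matters because the number of generated terms grows quickly with the degree. The sketch below counts the output columns for degrees 1 to 3. It assumes 9 preprocessed inputs (4 scaled numeric columns plus 5 flags, which holds only if each categorical feature has exactly two levels), and checks the closed-form count against what PolynomialFeatures reports.

```python
from math import comb

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Assumption: 4 numeric columns + 5 binary flags after drop='first'
n_inputs = 9

formula_counts = {}
sklearn_counts = {}
for degree in (1, 2, 3):
    # All monomials up to `degree` in n_inputs variables, minus the constant term
    formula_counts[degree] = comb(n_inputs + degree, degree) - 1
    fitted = PolynomialFeatures(degree=degree, include_bias=False).fit(
        np.zeros((1, n_inputs))
    )
    sklearn_counts[degree] = fitted.n_output_features_

print(formula_counts)   # {1: 9, 2: 54, 3: 219}
print(sklearn_counts)
```

At degree 3 the model has a couple of hundred coefficients to estimate, which is why the Ridge penalty strength is tuned jointly with the degree.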

6. Evaluate Model

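import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

y_pred = gs.predict(X_test)
# Take the square root explicitly: the `squared` argument of
# mean_squared_error is deprecated and removed in recent scikit-learn releases
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
r2   = r2_score(y_test, y_pred)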

print(f"Test RMSE: {rmse:.2f} grade points")
print(f"Test R²  : {r2:.3f}")
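For readers who want to see what the two metrics measure, here is a minimal sketch that computes RMSE and R² by hand on four made-up grade values and checks them against scikit-learn. The arrays are illustrative and are not results from the model.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Made-up true and predicted grades, for illustration only
y_true = np.array([10.0, 12.0, 8.0, 15.0])
y_hat = np.array([11.0, 12.0, 9.0, 13.0])

# RMSE: square root of the mean squared residual
residuals = y_true - y_hat
rmse_manual = np.sqrt(np.mean(residuals ** 2))

# R^2: 1 minus residual sum of squares over total sum of squares
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_manual = 1.0 - ss_res / ss_tot

# RMSE of always predicting the mean grade, as a naive reference point
rmse_mean_baseline = np.sqrt(np.mean((y_true - y_true.mean()) ** 2))

print(f"RMSE (manual): {rmse_manual:.4f}")
print(f"R^2  (manual): {r2_manual:.4f}")
print(f"RMSE of mean-only baseline: {rmse_mean_baseline:.4f}")
```

An R² of 0 corresponds to doing no better than the mean-only baseline, so comparing the test RMSE against such a baseline on the real data helps judge how much the model adds.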

7. Inspect Key Coefficients

# Extract names of polynomial features
poly = gs.best_estimator_.named_steps["poly"]
num_names = numeric_features
cat_names = gs.best_estimator_.named_steps["prep"] \
                   .named_transformers_["cat"] \
                   .get_feature_names_out(categorical_features)
feature_names = poly.get_feature_names_out(
    input_features=np.hstack([num_names, cat_names])
)

# Retrieve coefficients
coefs = gs.best_estimator_.named_steps["ridge"].coef_

# Show top 10 by absolute value
import pandas as pd
coef_series = pd.Series(coefs, index=feature_names)
coef_series.abs().sort_values(ascending=False).head(10).plot(
    kind="barh", figsize=(8,5)
)
plt.gca().invert_yaxis()
plt.title("Top Polynomial Features Influencing Final Grade")
plt.xlabel("Coefficient magnitude")
plt.tight_layout()
plt.show()

Summary

By combining polynomial feature engineering with Ridge regularisation in a single pipeline, we aim for:

Accurate prediction of the final course grade (G3) from earlier grades and support factors, as measured by test RMSE and R².
Nonlinear modelling of educational progress (capturing curvature in the G1→G2→G3 relationship).
Interpretable drivers: the largest coefficient terms (e.g., G2² or G1 × studytime) suggest where academic interventions and resource planning might focus, although the coefficients describe associations rather than causes.

