L1 (Lasso) and L2 (Ridge) Regression Practice¶

I used the California Housing Dataset (already built into SKLearn) to practice L1 and L2 regression. This dataset contains features of houses and their sale prices.

I first fit a simple regression model, then compared its coefficients against those obtained after applying L1 and L2 regularization.

Regularization discourages overfitting by adding a "penalty term" to the loss function that grows with the size of the model's coefficients.
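As a minimal sketch of what that penalty term looks like (the coefficient vector w, alpha, and the MSE value below are made-up illustration numbers, not taken from the models in this notebook):

import numpy as np

# Illustrative numbers only: a coefficient vector, a penalty strength,
# and a pretend training mean squared error.
w = np.array([0.8, -0.3, 0.0, 1.5])
alpha = 0.1
mse = 0.42

l1_loss = mse + alpha * np.sum(np.abs(w))   # Lasso: MSE + alpha * sum(|w_i|)
l2_loss = mse + alpha * np.sum(w ** 2)      # Ridge: MSE + alpha * sum(w_i^2)
print(l1_loss, l2_loss)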

Import Libraries¶

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Lasso
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.datasets import fetch_california_housing

Import California Housing Dataset¶

In [2]:
housing = fetch_california_housing()
In [3]:
housing_df = pd.DataFrame(data=housing.data, columns=housing.feature_names)
housing_df['Median Price (in $100k)'] = housing.target
housing_df.head()
Out[3]:
MedInc HouseAge AveRooms AveBedrms Population AveOccup Latitude Longitude Median Price (in $100k)
0 8.3252 41.0 6.984127 1.023810 322.0 2.555556 37.88 -122.23 4.526
1 8.3014 21.0 6.238137 0.971880 2401.0 2.109842 37.86 -122.22 3.585
2 7.2574 52.0 8.288136 1.073446 496.0 2.802260 37.85 -122.24 3.521
3 5.6431 52.0 5.817352 1.073059 558.0 2.547945 37.85 -122.25 3.413
4 3.8462 52.0 6.281853 1.081081 565.0 2.181467 37.85 -122.25 3.422

Prepare the Data¶

In [4]:
X = housing.data
y = (housing.target > 2.0).astype(int)  # Binary label: 1 if median price > $200k, 0 otherwise

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

feature_names = ['MedInc','HouseAge','AveRooms','AveBedrms','Population','AveOccup','Latitude','Longitude']

Fit a Simple Logistic Regression Model, then Find the Coefficients¶

In [5]:
model = LogisticRegression()
model.fit(X_train_scaled, y_train)
Out[5]:
LogisticRegression()
In [6]:
coefficients = model.coef_[0]
In [7]:
plt.figure(figsize=(10, 6))
plt.bar(feature_names, coefficients)
plt.ylabel('Coefficient')
plt.title('Coefficients of Logistic Regression Model')
plt.show()

L1 Regularization (aka Lasso 'Least Absolute Shrinkage and Selection Operator')¶

Regularization addresses overfitting: a model that performs well on training data but poorly on test data because it is over-complex. L1 regularization removes less useful features by shrinking the coefficients of the least important variables to exactly 0 (automatic feature selection). The resulting model is simpler and less prone to overfitting. https://www.youtube.com/watch?v=LmpBt0tenJE
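As a rough illustration of that feature-selection effect (a sketch assuming the X_train_scaled and y_train arrays prepared above; the alpha values are arbitrary), larger alpha values force more Lasso coefficients to exactly 0:

# Sketch: as alpha grows, Lasso drives more coefficients to exactly 0.
# Reuses X_train_scaled / y_train from the data-preparation cell above.
for a in [0.001, 0.01, 0.1, 1]:
    l = Lasso(alpha=a)
    l.fit(X_train_scaled, y_train)
    n_zero = (l.coef_ == 0).sum()
    print(f"alpha={a}: {n_zero} of {len(l.coef_)} coefficients are exactly 0")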

In [8]:
lasso = Lasso()
In [9]:
param_grid = {
    'alpha':[0.00001, 0.0001,0.001,0.01,0.1,1,10,100]
}
In [10]:
lasso_cv = GridSearchCV(lasso, param_grid, cv=3, n_jobs = -1)
In [11]:
lasso_cv.fit(X_train,y_train)
Out[11]:
GridSearchCV(cv=3, estimator=Lasso(), n_jobs=-1,
             param_grid={'alpha': [1e-05, 0.0001, 0.001, 0.01, 0.1, 1, 10,
                                   100]})
In [12]:
lasso_cv.best_estimator_
Out[12]:
Lasso(alpha=1e-05)
In [13]:
lasso1 = Lasso(alpha = 0.00001)
In [14]:
lasso1.fit(X_train, y_train)
Out[14]:
Lasso(alpha=1e-05)
In [15]:
plt.figure(figsize=(10, 6))
plt.bar(feature_names, lasso1.coef_)
plt.ylabel('Coefficient')
plt.title('Coefficients of Model (with L1 Regularization)')
plt.show()

L2 Regularization (aka Ridge Regression)¶

In this model, the coefficients of less useful features are shrunk toward 0, but they usually do not reach exactly 0.
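A companion sketch for L2 (again assuming the X_train_scaled and y_train arrays prepared above, with arbitrary alpha values): the coefficients shrink as alpha grows, but they are not driven to exactly 0 the way Lasso's are.

# Sketch: Ridge shrinks coefficients toward 0 as alpha grows, but almost never
# to exactly 0. Reuses X_train_scaled / y_train from the data-preparation cell.
for a in [0.1, 10, 1000]:
    r = Ridge(alpha=a)
    r.fit(X_train_scaled, y_train)
    print(f"alpha={a}: largest |coefficient| = {abs(r.coef_).max():.4f}, "
          f"exact zeros = {(r.coef_ == 0).sum()}")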

In [16]:
ridge = Ridge()
In [17]:
param_grid = {
    'alpha':[0.00001, 0.0001,0.001,0.01,0.1,1,10,100]
}
In [18]:
ridge_cv = GridSearchCV(ridge, param_grid, cv=3, n_jobs = -1)
In [19]:
ridge_cv.fit(X_train,y_train)
Out[19]:
GridSearchCV(cv=3, estimator=Ridge(), n_jobs=-1,
             param_grid={'alpha': [1e-05, 0.0001, 0.001, 0.01, 0.1, 1, 10,
                                   100]})
In [20]:
ridge_cv.best_estimator_
Out[20]:
Ridge(alpha=1e-05)
In [21]:
ridge1 = Ridge(alpha = 0.00001)
In [22]:
ridge1.fit(X_train, y_train)
Out[22]:
Ridge(alpha=1e-05)
In [23]:
plt.figure(figsize=(10, 6))
plt.bar(feature_names, ridge1.coef_)
plt.ylabel('Coefficient')
plt.title('Coefficients of Model (with L2 Regularization)')
plt.show()
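Since the stated goal was to compare coefficients before and after regularization, a final sketch plots all three sets side by side. It assumes the model, lasso1, and ridge1 objects fitted above; note that lasso1 and ridge1 were fitted on the unscaled X_train while the logistic model used X_train_scaled, so the coefficient magnitudes are not directly comparable.

import numpy as np

# Sketch: grouped bar chart of the three coefficient vectors fitted above.
x = np.arange(len(feature_names))
width = 0.25

plt.figure(figsize=(12, 6))
plt.bar(x - width, model.coef_[0], width, label='Logistic (scaled X)')
plt.bar(x, lasso1.coef_, width, label='Lasso (alpha=1e-05)')
plt.bar(x + width, ridge1.coef_, width, label='Ridge (alpha=1e-05)')
plt.xticks(x, feature_names, rotation=45)
plt.ylabel('Coefficient')
plt.legend()
plt.title('Coefficient Comparison Across Models')
plt.show()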