'Why my do-it-myself Lasso gives different solution paths from the scikit-learn's lasso_path?
I tried recreating Lasso from scratch in Python with some modifications to the Lasso class as below:
class custom_lasso():
def __init__(self, lr, iter, lamb):
self.eta = lr
self.max_iter = iter
self.alpha = lamb
# Function for model training
def fit(self, X, Y):
# no_of_training_examples, no_of_features
self.n, self.p = X.shape
# weight initialization
self.B, self.X, self.Y = zeros(self.p), X, Y
# gradient descent learning
for i in range(self.max_iter):
self.update_weights(i)
return self
# Helper function to update weights in gradient descent
def update_weights(self, j):
# calculate gradients
dB = self.X.transpose() @ (self.X @ self.B - self.Y)
if j:
dB += abs(self.B)/self.B * self.alpha
# update weights
self.B -= self.eta * dB
return self
given that I aim to minimize 0.5 sum_{i=1}^n (y_i - sum_{j=1}^p x_{ij} b_j)^2 + alpha * sum_{j=1}^p abs(b_j) w.r.t. to all the b_j.
Problem: If I try to compare the solution paths with scikit-learn's lasso_path, I found the resulting coefficients are quite different. For example, if I set n = 100, p = 5 and alpha = 1, then lasso_path gives b_2 = 0 but the scratchy lasso I came up with gives b_2 = 0.5962:
How come the differences? Any more codes I should include? Thanks a lot.
Solution 1:[1]
You're doing gradient descent on the objective function, but the objective function is not smooth because of the L1 penalty.
I don't get why there's an if here (and it does not seem to be used in your code, meaning that you just solve ordinary least squares if eta is small enough):
if j:
dB += abs(self.B)/self.B * self.alpha
In order to obtain a converging solver, you need to apply proximal gradient descent, i.e. apply a soft-thresholding step after the gradient descent step:
gd_update = B - eta * X.T @ (X @ B - y)
B = np.maximum(0, np.abs(gd_update) - alpha * eta)
For it to converge, you need eta < 2 / np.linalg.norm(X, ord=2)**2, as for standard gradient descent.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | P. Camilleri |

