Python - ValueError: operands could not be broadcast together with shapes (17,90) (17,)

I am trying to implement logistic regression with regularization in Python using optimize.minimize from the SciPy library. Here is my code:

import pandas as pd
import numpy as np
from scipy import optimize

l = 0.1 # lambda

def sigmoid(z):

    return 1 / (1 + np.exp(-z))

def cost_function_logit(theta, X, y, l):

    h = sigmoid(X @ theta)

    # cost

    J = -1 / m * (y.T @ np.log(h)
                 + (1 - y).T @ np.log(1 - h)) \
                 + l / (2 * m) * sum(theta[1:] ** 2)

    # gradient

    a = 1 / m * X.T @ (h - y)
    b = l / m * theta
    grad = a + b
    grad[0] = 1 / m * sum(h - y)

    return J, grad

data = pd.read_excel('Data.xlsx')

X = data.drop(columns = ['healthy'])
m, n = X.shape
X = X.to_numpy()
X = np.hstack([np.ones([m, 1]), X])

y = pd.DataFrame(data, columns = ['healthy'])
y = y.to_numpy()

initial_theta = np.zeros([n + 1, 1])

options = {'maxiter': 400}
res = optimize.minimize(cost_function_logit,
                        initial_theta,
                        (X, y, l),
                        jac = True,
                        method = 'TNC',
                        options = options)

An error occurs on the line where I use optimize.minimize. The last two lines of the error are as follows:

grad = a + b

ValueError: operands could not be broadcast together with shapes (17,90) (17,)

I have checked the type and dimensions of X, y and theta, and they seem correct to me.

>>> type(X)
<class 'numpy.ndarray'>
>>> type(y)
<class 'numpy.ndarray'>
>>> type(theta)
<class 'numpy.ndarray'>
>>> X.shape
(90, 17)
>>> y.shape
(90, 1)
>>> theta.shape
(17, 1)

The error says a is a (17,90) matrix but based on my calculations it should be a (17,1) vector. Does anyone know where I went wrong?

Solution 1:^[1]

a has shape (17, 90), so each of its 17 rows is a 90-element vector, whereas b has shape (17,), so its elements are single numbers. I'm not totally sure what you're trying to do, but if you want to add two arrays, their shapes need to be compatible for broadcasting. If you want to add each element of b to every element in the corresponding row of a, you can do

grad = a + np.stack((b,) * a.shape[1], axis=-1)

but I assume the real mistake is in how a is constructed.
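One plausible way a ends up as (17, 90) can be reproduced with synthetic data. The sketch below is not from the question: the random X and y stand in for Data.xlsx, and it assumes theta is 1-D with shape (17,) inside the cost function, which is what the (17,) shape of b in the traceback suggests. Under that assumption, h has shape (90,), and subtracting the (90, 1) column y broadcasts to (90, 90). Flattening y is one possible fix; the rewritten cost function is an illustration, not the asker's final code.

```python
import numpy as np
from scipy import optimize

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Synthetic stand-in for Data.xlsx (assumption): 90 samples, 16 features plus intercept.
rng = np.random.default_rng(0)
m, n = 90, 16
X = np.hstack([np.ones((m, 1)), rng.normal(size=(m, n))])  # (90, 17)
y_col = rng.integers(0, 2, size=(m, 1)).astype(float)      # (90, 1), as in the question
theta = np.zeros(n + 1)                                    # (17,), assumed 1-D inside the cost function

h = sigmoid(X @ theta)               # (90,)
bad_a = X.T @ (h - y_col) / m        # (90,) - (90, 1) broadcasts to (90, 90), so a is (17, 90)

y_flat = y_col.ravel()               # (90,)
good_a = X.T @ (h - y_flat) / m      # (17,), matches the shape of b

def cost_function_logit(theta, X, y, l):
    """Regularized logistic cost and gradient; expects 1-D theta and 1-D y."""
    m = y.size
    h = sigmoid(X @ theta)
    J = (-(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m
         + l / (2 * m) * np.sum(theta[1:] ** 2))
    grad = X.T @ (h - y) / m
    grad[1:] += l / m * theta[1:]    # the intercept term is not regularized
    return J, grad

J0, grad0 = cost_function_logit(theta, X, y_flat, 0.1)
res = optimize.minimize(cost_function_logit, np.zeros(n + 1),
                        args=(X, y_flat, 0.1), jac=True, method='TNC')
```

With the question's DataFrame, a 1-D y could be obtained with y = data['healthy'].to_numpy() or by calling y.ravel() on the existing array. Starting from a 1-D initial_theta such as np.zeros(n + 1) keeps every shape in the function 1-D or 2-D as intended.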

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	CasualScience
