Parameter estimation in a logistic regression model by negative log-likelihood minimisation - R

I am trying to estimate the parameters of a logistic regression model "by hand" on the iris dataset by minimising the cross-entropy. Note that I have recoded the dataset so that there are only two classes - setosa and other - and normalised the four features with the scale function:

library(dplyr)
library(optimx)
iriso <- iris %>%
  mutate(Species = ifelse(Species == "setosa", "setosa", "other")) %>%
  mutate(Species_n = ifelse(Species == "setosa", 1, 0)) %>%
  as.data.frame()

iriso[,1:4] <- scale(iriso[,1:4])
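Just to confirm the setup, a quick check of the recoded labels and the scaling (my own addition, written in base R rather than dplyr so it stands alone):

```r
# Base-R equivalent of the dplyr pipeline above
iriso <- iris
iriso$Species <- ifelse(iriso$Species == "setosa", "setosa", "other")
iriso$Species_n <- ifelse(iriso$Species == "setosa", 1, 0)
iriso[, 1:4] <- scale(iriso[, 1:4])

table(iriso$Species)               # 100 other, 50 setosa
round(colMeans(iriso[, 1:4]), 10)  # all 0 after scale()
```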

My understanding is that, if everything is correct, the optimisation should converge to the same set of parameters regardless of the starting point. However, the parameter estimates keep bouncing around between runs:

Here are the functions I define:

X <- model.matrix(~.,data=iriso[,1:4])
Y <- model.matrix(~0+Species_n,data=iriso) # not used below; the loop passes iriso$Species_n directly

sigmoid <- function(y){
  1/(1 + exp(-y))
}

# w is a vector of weights; x is the design matrix of observations
logistique <- function(w, x){
  sigmoid(
    y = w[1]*x[,1] + w[2]*x[,2] + w[3]*x[,3] + w[4]*x[,4] + w[5]*x[,5]
    )
}
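As an aside, the element-wise weighted sum above can be written as a single matrix product, which also generalises beyond five columns. A sketch (logistique_mat is my own name, not from the question):

```r
sigmoid <- function(y) 1 / (1 + exp(-y))

# Vectorised equivalent: x %*% w computes the same weighted sum for every row
logistique_mat <- function(w, x) sigmoid(drop(x %*% w))

# Check against the hand-written version on random data
set.seed(1)
x <- cbind(1, matrix(rnorm(40), 10, 4))  # intercept column + 4 features
w <- rnorm(5)
manual <- sigmoid(w[1]*x[,1] + w[2]*x[,2] + w[3]*x[,3] + w[4]*x[,4] + w[5]*x[,5])
all.equal(manual, logistique_mat(w, x))  # TRUE
```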

# y is the vector of observed labels
entropie <- function(w, y, x){
  prob_pred <- logistique(w = w, x = x)

  -sum(
    y*log(prob_pred) + (1-y)*log(1-prob_pred)
  ) 
}
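One quick sanity check I ran on entropie (my own addition): with all weights at zero, every predicted probability is 0.5, so the cross-entropy must equal n * log(2) whatever the data are:

```r
sigmoid <- function(y) 1 / (1 + exp(-y))
logistique <- function(w, x){
  sigmoid(w[1]*x[,1] + w[2]*x[,2] + w[3]*x[,3] + w[4]*x[,4] + w[5]*x[,5])
}
entropie <- function(w, y, x){
  prob_pred <- logistique(w = w, x = x)
  -sum(y*log(prob_pred) + (1-y)*log(1-prob_pred))
}

set.seed(2)
x <- cbind(1, matrix(rnorm(600), 150, 4))  # intercept + 4 random features
y <- rep(c(1, 0), c(50, 100))
entropie(rep(0, 5), y, x)  # 150 * log(2), about 103.97
```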

Optimisation step:

for(i in 1:5){
  w0 <- rnorm(n = 5) #set of initial parameters
  optimx(par = w0, fn = entropie,  
         method = "Nelder-Mead",
         y = iriso$Species_n, x = X) %>%
    print()
}
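For reference, this is the built-in glm fit on the same data that I would expect the hand-rolled optimisation to match (a sketch in base R so it runs on its own; note that glm may itself emit warnings on this data, e.g. about fitted probabilities numerically 0 or 1):

```r
# Base-R rebuild of the data (mirrors the dplyr pipeline above)
iriso <- iris
iriso$Species_n <- ifelse(iriso$Species == "setosa", 1, 0)
iriso[, 1:4] <- scale(iriso[, 1:4])

# Reference fit: glm with family = binomial maximises the same log-likelihood
fit <- glm(Species_n ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
           data = iriso, family = binomial)
coef(fit)
```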

I cannot understand why I am not obtaining consistent answers. Is there something wrong with the code above, or is there a concept I am missing?

Thanks.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
