Spatial panel regression in R: non-conformable spatial weights?

I am trying to run a spatial panel regression in R with the splm package. I have polygons with data summarized over time, and I want to see how the dependent variable is affected by other variables that also change over time.

The full data set has 546 regions and a number of variables. To test how it works, I took a subset of 3 polygons, including the shapefile for calculating the weights and the data:

https://drive.google.com/file/d/0B4SK0f2zZUKxZ0dDU2lnclB2M3c/view?usp=sharing

    # load packages (spdep: poly2nb, dnearneigh, nb2mat, mat2listw; splm: spml)
    library(rgdal)
    library(spdep)
    library(splm)

    # load data
    file <- "sector_panel_data_test.csv"
    sector_data <- read.table(file, sep = ",", header = TRUE, quote = "")
    sector_data[is.na(sector_data)] <- 0
    names(sector_data)

    # load shape
    sectors <- readOGR(dsn = ".", layer = "sectors_test_sample_year1")
    nb <- poly2nb(sectors)

    # distance-based neighbors
    coords <- coordinates(sectors)
    nb.d125 <- dnearneigh(coords, 0, 125000, row.names = sectors$Code)

    # create weights matrix
    mat.d125 <- nb2mat(nb.d125, glist = NULL, style = "W", zero.policy = TRUE)

    # and then a weights list object
    listd125 <- mat2listw(mat.d125, style = "W")

    # design model and run, just picked one variable here
    fm <- prop_fdeg ~ mean_pop
    randommodel <- spml(fm, data = sector_data, index = NULL,
                        listw = listd125, model = "random", lag = FALSE)

I get the following error:

Error in spreml(formula = formula, data = data, index = index, w = listw2mat(listw), : Non conformable spatial weights

Does anyone know what this means? I have searched everywhere, and only found people with the same problem looking for a solution.



Solution 1:[1]

I've also just received the same error. After going step by step through the source code with my data (see below), it appears that missing values in the panel data lead to listwise deletion of some rows. Those deleted rows, in turn, leave the panel data and the listw object with mismatched dimensions. To fix the issue, you'll need to either (1) impute the missing data, (2) delete the dropped rows from your listw object, or (3) drop the variables with missing values from your model. In my case, imputing all missing data seemed to stop the error. You'll also need to keep the panel balanced, as splm will break with most unbalanced panel data too. The make.pbalanced command in plm does not seem to help much with this, because it adds rows with NAs that splm will reject.
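
Option (2) can be sketched in base R as follows. The data frame `panel`, its columns `id`, `year`, `y`, `x`, and the matrix `W` are made up for illustration and are not from the question's data. The idea is to drop every unit that has any incomplete row, so the panel stays balanced, and to remove the same units from the weights matrix before row-standardising it again.

```r
# Toy balanced panel: 4 regions x 3 years (all names are hypothetical)
panel <- expand.grid(id = c("A", "B", "C", "D"), year = 2001:2003,
                     stringsAsFactors = FALSE)
panel$y <- seq_len(nrow(panel))
panel$x <- seq_len(nrow(panel)) * 2
panel$x[panel$id == "C" & panel$year == 2002] <- NA  # one missing value

# Toy binary neighbour matrix with region codes as dimnames
W <- matrix(1, 4, 4, dimnames = list(c("A", "B", "C", "D"),
                                     c("A", "B", "C", "D")))
diag(W) <- 0

# Units with at least one incomplete row
bad_ids <- unique(panel$id[!complete.cases(panel)])

# Drop those units entirely from the data, so the panel stays balanced
panel_ok <- panel[!panel$id %in% bad_ids, ]

# Drop the same units from the weights matrix, then row-standardise again
keep <- !rownames(W) %in% bad_ids
W_ok <- W[keep, keep, drop = FALSE]
rs <- rowSums(W_ok)
W_ok <- W_ok / ifelse(rs > 0, rs, 1)  # rows without neighbours stay all zero

# The dimension check that spreml performs should now pass
stopifnot(nrow(W_ok) == length(unique(panel_ok$id)))
```

The reduced matrix could then be converted with `mat2listw(W_ok, style = "W")`, assuming its rows are in the same order as the units in the data.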

A handful of ways to check for missingness, impute missing data and/or see how your data works in the source code:

  1. Compare dim(your_data) and dim(na.omit(your_data)).

  2. Visualize missingness in your panel data with naniar (also, see the new panelView package)

    install.packages("naniar") # visualise missing data 
    library(ggplot2)
    library(naniar)
    gg_miss_var(your_data)  
    
  3. Run plm (not splm) on your data and check dimensions of the data in the output (as compared with the original data).

    library(plm)
    p_out <- plm(formula = your_formula, data = your_data, model = "within") 
    summary(p_out)
    dim(model.matrix(p_out))  # rows actually used, after listwise deletion
    
  4. A straightforward way to impute missing data is with the simputation package. For more see https://cran.r-project.org/web/packages/simputation/vignettes/intro.html

  5. The Amelia package offers better options for multiple imputation with time series, cross sectional data: https://gking.harvard.edu/amelia

  6. Run the underlying code for spreml (available on R-Forge) directly. The excerpt below suggests that the error is generated by its last line. Just above that line we can see how n is defined, which suggests a way to debug: run the code directly and see where dim(w) differs from n (where w <- listw2mat(your_listw_object), as in the error message).

    ## data management through plm functions
    pmod <- plm(formula, data, index=index, model="pooling")
    X <- model.matrix(pmod)
    y <- pmodel.response(pmod)
    
    #names(index) <- row.names(data)
    #ind <- index[which(names(index) %in% row.names(X))]
    #tind <- tindex[which(names(index) %in% row.names(X))]
    
    ind <- attr(pmod$model, "index")[, 1]
    tind <- attr(pmod$model, "index")[, 2]
    oo <- order(tind, ind)
    X <- X[oo, , drop=FALSE]
    y <- y[oo]
    ind <- ind[oo]
    tind <- tind[oo]
    n <- length(unique(ind))
    k <- dim(X)[[2]]
    t <- max(tapply(X[, 1], ind, length))
    nT <- length(ind)
    
    ## check compatibility of weights matrix
    if (dim(w)[[1]] != n) stop("Non conformable spatial weights")
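
The same comparison can be made without stepping through the package source. The helper below is a base-R sketch and not part of splm; `check_conformable` and its arguments are hypothetical names. It imitates listwise deletion on the model variables, counts the units that remain, and reports whether the panel is still balanced.

```r
# Hypothetical helper: compare a weights matrix with the panel after NA removal
check_conformable <- function(data, id, time, vars, w) {
  # Rows a model would keep after listwise deletion on the model variables
  used <- data[complete.cases(data[, vars, drop = FALSE]), ]
  n <- length(unique(used[[id]]))  # same definition of n as in the excerpt above
  obs_per_unit <- as.vector(table(as.character(used[[id]])))
  list(
    n_units       = n,
    n_weights     = nrow(w),
    conformable   = nrow(w) == n,
    balanced      = all(obs_per_unit == length(unique(used[[time]]))),
    dropped_units = setdiff(unique(as.character(data[[id]])),
                            unique(as.character(used[[id]])))
  )
}

# Toy data: 3 regions x 2 years; region "r3" has no usable rows
d <- data.frame(id   = rep(c("r1", "r2", "r3"), each = 2),
                year = rep(2001:2002, times = 3),
                y    = c(1, 2, 3, 4, NA, NA),
                x    = c(5, 6, 7, 8, 9, 10))
w <- matrix(1 / 2, 3, 3) - diag(1 / 2, 3)  # 3 x 3 row-standardised weights

res <- check_conformable(d, "id", "year", c("y", "x"), w)
print(res)
```

In this toy case the matrix has three rows but only two units survive, which is the situation that triggers the error.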
    

Solution 2:[2]

This might not be relevant to your problem, but hopefully can help others searching for this error.

The data have to be in a specific format: the first two columns must contain the unit index and the time, in that order, followed by the remaining variables. Switching time and index will cause "Non conformable spatial weights" because dim(w) != n, where n will then be the number of unique time periods.
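
A small base-R illustration of why the order matters, using a made-up data frame `d` (the column names `region`, `year` and `y` are assumptions). When no index is supplied, the first column is read as the unit identifier, so the number of units is the number of distinct values in column one.

```r
# Toy panel with the time column first (wrong order): 5 regions, 2 years
d <- data.frame(year   = rep(2001:2002, each = 5),
                region = rep(paste0("r", 1:5), times = 2),
                y      = rnorm(10))

# With index = NULL the first column is taken as the unit identifier
n_wrong <- length(unique(d[[1]]))  # counts years, not regions

# Put the unit identifier first and time second, then everything else
d_fixed <- d[, c("region", "year", setdiff(names(d), c("region", "year")))]
n_right <- length(unique(d_fixed[[1]]))  # counts regions

print(c(n_wrong = n_wrong, n_right = n_right))
```

Alternatively, naming the index columns explicitly, for example `index = c("region", "year")` in the `spml()` call, should avoid relying on column position.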

Solution 3:[3]

I had the same problem myself.

It turned out that my spatial weight matrix contained one extra country.

For example, your dataset contains 33 countries, but your matrix has 34.

Simply remove that extra country from the matrix.
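
One way to find the extra unit, sketched in base R under the assumption that the weights matrix carries the unit codes as row and column names. The matrix `W` and the vector `data_ids` below are invented for illustration.

```r
# Toy example: the weights matrix has 4 countries, the data only 3
W <- matrix(1, 4, 4, dimnames = list(c("AT", "BE", "CH", "DE"),
                                     c("AT", "BE", "CH", "DE")))
diag(W) <- 0
data_ids <- c("AT", "BE", "DE")  # unique unit codes found in the panel data

extra_in_W   <- setdiff(rownames(W), data_ids)  # in the matrix, not in the data
missing_in_W <- setdiff(data_ids, rownames(W))  # in the data, not in the matrix

# Keep only the countries present in the data, in the data's order
W_sub <- W[data_ids, data_ids, drop = FALSE]

print(extra_in_W)
print(dim(W_sub))
```

If the original matrix was row-standardised, the rows of the reduced matrix no longer sum to one and would need to be standardised again.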

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 jyr
Solution 3 Heman Hemanovich