Error: replacement has length zero

I have two data frames. df1 has about 2000 rows and 100 columns. df2 was created as a copy of df1, and I am filling it with values calculated from df1. Here is my code:

for(i in 1:ncol(df1)){
  for(j in 1:nrow(df1)-9){df2[i,j] = (df1[i,j+9]/df1[i,j]) -1}
}

Error in `[<-.data.frame`(`*tmp*`, 1, j, value = numeric(0)) : replacement has length zero

I am getting the error "replacement has length zero". Can anybody point out what is wrong with the code above?



Solution 1:[1]

It seems that you have swapped nrow and ncol in the loop conditions. The inner loop indexes your columns, but its range is built from nrow(df1) - 9. Because you have far more rows than columns, the loop runs past the last column, and the calculation with column j + 9 is no longer possible. That is why the replacement has length zero.

Using this code it should work:

for(i in 1:nrow(df1)){
  # parentheses matter: 1:ncol(df1)-9 would be evaluated as (1:ncol(df1)) - 9
  for(j in 1:(ncol(df1)-9)){df2[i,j] = (df1[i,j+9]/df1[i,j]) -1}
}

Doing so, you would do the calculation on all rows of the first 91 columns. Is this what you want to do?

Solution 2:[2]

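As @AK88 mentioned, the problem with your loop is operator precedence: the : operator is evaluated before the subtraction.

Try putting nrow(df1)-9 in parentheses: 1:(nrow(df1)-9)

As written, you are essentially executing 1:nrow(df1) and then subtracting 9 from every element, so the index starts at -8 and passes through 0.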
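
A quick console check makes the precedence issue concrete. This is a minimal sketch with a made-up length n and a made-up vector x (both hypothetical, not taken from the question). It shows that : binds more tightly than binary -, and that an index of 0 selects nothing, which is one way to end up with a zero-length replacement value:

```r
n <- 12  # hypothetical length, stands in for nrow(df1)

# `:` is evaluated before binary `-`, so this is (1:n) - 9
bad <- 1:n - 9        # runs from -8 to 3 and passes through 0
good <- 1:(n - 9)     # 1 2 3, the intended index range

# seq_len() is a safer way to build the range: it returns an
# empty vector (so a loop body is skipped) when the count is 0
safe <- seq_len(max(n - 9, 0))

x <- c(10, 20, 30)    # hypothetical data
length(x[0])          # indexing with 0 selects nothing, so this is 0

# Assigning a zero-length value to one element reproduces the message
msg <- tryCatch({
  x[1] <- x[0]
  "no error"
}, error = function(e) conditionMessage(e))
msg
```

Using seq_len() instead of 1:(n - 9) is a design choice rather than a requirement: it avoids the backwards sequence 1:0 when there are fewer than ten elements.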

In addition, R has many vectorised helpers that run such calculations much faster than explicit loops, although they take some time to get used to. Look into the apply family of functions and Hadley Wickham's Advanced R for more information.

library(dplyr)

## example data
v <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
df1 <- data_frame(v1 = v, v2 = v, v3 = v, v4 = v, v5 = v, v6 = v, v7 = v, v8 = v,
  v9 = 2 * v, v10 = 3 * v, v11 = 4 * v, v12 = 5 * v, v13 = 6 * v, v14 = 2 * v,
  v15 = 3 * v, v16 = 4 * v, v17 = 5 * v, v18 = 8 * v, v19 = 2 * v, v20 = 10 * v)

df2 <- data_frame()

system.time(for (i in 1:ncol(df1)) {
  for (j in 1:(nrow(df1) - 9)) {
    df2[i, j] = (df1[i, j + 9]/df1[i, j]) - 1
  }
})
#>    user  system elapsed 
#>   0.472   0.008   0.484

## a good bit faster (although negligible at this size)
system.time(tmp <- mapply(function(x, y) {
  (x/y) - 1
}, df1[, (9 + 1):nrow(df1)], df1[, 1:(nrow(df1) - 9)]) %>% as_data_frame())
#>    user  system elapsed 
#>   0.000   0.000   0.003

identical(tmp, df2)
#> [1] TRUE

For future reference, including a sample dataset in your question and using the reprex package can make it easier for others to help you.

UPDATE: Per further discussion, it seems your row and column mix-up was unintentional (the original statement of the problem would likely require a square dataset or something similar). Reversing your row / column indices, or your nrow / ncol statements will solve that issue.
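
If the goal was a ratio between each value and the value 9 rows further down in the same column (an assumption based on the update above, since the question does not state it explicitly), the swapped-index loop might look like the sketch below. The toy df1 with 12 rows and 3 columns and the names shift, n_out and vec are made up for illustration:

```r
# Hypothetical toy data: 12 rows, 3 columns
df1 <- data.frame(a = 1:12, b = (1:12) * 2, c = (1:12)^2)
shift <- 9

# Number of rows that have a partner `shift` rows further down
n_out <- nrow(df1) - shift

# Pre-allocate the result with NA so rows without a partner stay NA
df2 <- as.data.frame(matrix(NA_real_, nrow(df1), ncol(df1),
                            dimnames = list(NULL, names(df1))))

# Loop version: row index first, column index second
for (j in seq_len(ncol(df1))) {
  for (i in seq_len(n_out)) {
    df2[i, j] <- df1[i + shift, j] / df1[i, j] - 1
  }
}

# Vectorised version: divide two row-shifted blocks in one step
vec <- as.matrix(df1[shift + seq_len(n_out), ]) /
  as.matrix(df1[seq_len(n_out), ]) - 1

# Compare the values only; the row names of the two blocks differ
all.equal(unname(vec), unname(as.matrix(df2[seq_len(n_out), ])))
```

The vectorised form avoids element-by-element assignment into a data frame, which is usually the slow part of such loops.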

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2