'for loop in R when we have LOOCV
I have two for loops in R with a data around 150000 observation. I tried apply() family functions but they were slower than for loop in my case. here is my code:
where k=500 and N= 150000, x is location at each time t (for all observation) and xm is specific x with a specific coordination that I filtered here. At each time j we observe xm so we remove it from the data and fit the model with the rest of dataset. I had an if else condition here that removed it in order to make the loop faster.
It's so slow, I am so thankful for your help!
xs = 0:200
result= matrix(0, k,N )
for (j in 1: N){
for ( i in 1:k){
a <- sum(dnorm(xs[i],xm[-j],bx))
b <- sum(dnorm(xs[i],x[-ind[j]],bx))
result[i,j]<-a/b
}
}
Solution 1:[1]
Using dummy values ind, x, and xm, here is a solution that runs in about 10 seconds on my machine (>1000 times faster than the original code).
# start with a small N for verification
N <- 15e2L
xm <- runif(N)
x <- runif(N)
ind <- sample(N)
k <- 501L
xs <- 0:500
bx <- 2
system.time({
# proposed solution
a <- outer(xs, xm, function(x, y) dnorm(x, y, bx))
b <- outer(xs, x[ind], function(x, y) dnorm(x, y, bx))
result1 <- (rowSums(a) - a)/(rowSums(b) - b)
})
#> user system elapsed
#> 0.08 0.02 0.10
system.time({
# OP's solution
result2 <- matrix(0, k, N)
for (j in 1:N){
for (i in 1:k){
a <- sum(dnorm(xs[i], xm[-j], bx))
b <- sum(dnorm(xs[i], x[-ind[j]], bx))
result2[i,j] <- a/b
}
}
})
#> user system elapsed
#> 109.42 0.80 110.90
# check that the results are the same
all.equal(result1, result2)
#> [1] TRUE
# use a large N
N <- 15e4L
xm <- runif(N)
x <- runif(N)
ind <- sample(N)
system.time({
a <- outer(xs, xm, function(x, y) dnorm(x, y, bx))
b <- outer(xs, x[ind], function(x, y) dnorm(x, y, bx))
result1 <- (rowSums(a) - a)/(rowSums(b) - b)
})
#> user system elapsed
#> 8.62 1.10 9.73
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
