Run time of a parallel loop in R
I have access to a fairly powerful remote desktop, where I ran some parallel code in R. However, the results surprised me when I compared the running time of the parallel loop with that of a simple for loop. My code is below.
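library(rbenchmark)
library(foreach)
library(doParallel)

# Modulus of the first eigenvalue returned by eigen() for an N x N Gaussian random matrix
max.eig <- function(N, sigma) {
  d <- matrix(rnorm(N^2, sd = sigma), nrow = N)
  E <- eigen(d)$values
  abs(E)[[1]]
}

no_cores <- 20
registerDoParallel(cores = no_cores)
cl <- makeCluster(no_cores)  # created here, but not passed to registerDoParallel()

# Compare the sequential (%do%) and parallel (%dopar%) loops
benchmark(
  foreach(n = 1:50) %do% max.eig(n, 1),
  foreach(n = 1:50) %dopar% max.eig(n, 1)
)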
The speed comparison gives
test                                     replications  elapsed  relative  user.self  sys.self  user.child  sys.child
foreach(n = 1:50) %do% max.eig(n, 1)              100    3.097     1.000      3.081     0.003       0.000      0.000
foreach(n = 1:50) %dopar% max.eig(n, 1)           100    8.327     2.689      0.943     6.643       3.648     27.523
Notice that the parallel loop takes about 2.7 times as long as the sequential one. This is true regardless of the number of cores I register. In contrast, when I run the comparison on my own computer (with 4 cores), the parallel loop is much faster. The results there are
test                                     replications  elapsed  relative  user.self  sys.self  user.child  sys.child
foreach(n = 1:70) %do% max.eig(n, 1)              100    11.36     1.939      11.27      0.03          NA         NA
foreach(n = 1:70) %dopar% max.eig(n, 1)           100     5.86     1.000       2.97      0.35          NA         NA
How is it possible that the parallel loop is slower on the remote desktop? And is there a way to set up the parallel cluster differently so that it is much faster and I can exploit the many cores on the remote computer?
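A minimal sketch of one alternative setup, offered as an assumption rather than a confirmed fix. In the code above, `%dopar%` uses the backend registered by `registerDoParallel(cores = no_cores)`, while the cluster stored in `cl` is never registered. The sketch below registers the cluster object explicitly and sends each worker one chunk of tasks instead of 50 tiny ones, since eigendecompositions of matrices up to 50 x 50 are likely cheap compared with the cost of dispatching a task. The names `n_workers`, `sizes` and `chunks` are illustrative, the choice of 2 workers is arbitrary, and whether this helps on the remote machine is untested.

```r
library(foreach)
library(doParallel)

max.eig <- function(N, sigma) {
  d <- matrix(rnorm(N^2, sd = sigma), nrow = N)
  E <- eigen(d)$values
  abs(E)[[1]]
}

# Assumption: 2 workers keeps the sketch portable; tune this on the real machine.
n_workers <- 2
cl <- makeCluster(n_workers)
registerDoParallel(cl)  # register the cluster object itself

# Split the 50 matrix sizes into one contiguous chunk per worker,
# so each worker receives a single, larger task.
sizes <- 1:50
chunks <- split(sizes, cut(seq_along(sizes), n_workers, labels = FALSE))

# Each worker loops over its chunk; results are concatenated in order.
res <- foreach(chunk = chunks, .combine = c) %dopar% {
  vapply(chunk, max.eig, numeric(1), sigma = 1)
}

stopCluster(cl)  # release the worker processes
```

Timing this against the `%do%` version with `benchmark()` on the target machine would show whether per-task dispatch overhead was the bottleneck there.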
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
