Parallelize the outer loop of a nested for loop using allgather
I am trying to parallelize the nested for loop below using allgather:
    for (int i = 0; i < N1; i++) {
        for (int j = 0; j < N0; j++)
            HS_1[i] += IN[j] * W0[j][i];
    }
Here N1 is 1000 and N0 is 764.
I have four processes and I just want to parallelize the outer loop. Is there a way to do it?
Solution 1:[1]
This looks like a matrix-vector multiplication. Let's assume that you've distributed the HS output vector. Each component needs the full IN vector, so you indeed need an allgather for that. You also need to distribute the W0 matrix: each process gets part of the i indices, and all of the j indices.
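The index bookkeeping behind this decomposition can be sketched in plain C, without MPI, by emulating the ranks one after another. This is only a sketch under assumptions that are not in the question: the data are taken to be `double`, `W0` is stored as a flat row-major array so that `W0[j][i]` becomes `w0[j * n1 + i]`, `HS_1` is assumed to start at zero, and the helper names `block_range`, `local_matvec` and `emulated_parallel_matvec` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Split n outer-loop iterations into nprocs contiguous blocks.
   The first (n % nprocs) ranks receive one extra iteration. */
void block_range(int n, int nprocs, int rank, int *start, int *count)
{
    assert(nprocs > 0 && rank >= 0 && rank < nprocs);
    int base = n / nprocs;
    int rem  = n % nprocs;
    *count = base + (rank < rem ? 1 : 0);
    *start = rank * base + (rank < rem ? rank : rem);
}

/* Compute the slice HS_1[start .. start+count) that one rank owns.
   Each rank needs the full IN vector (all n0 entries) but only the
   columns start .. start+count-1 of W0.
   Assumption: w0 is row-major with n0 rows and n1 columns. */
void local_matvec(int n0, int n1, const double *in, const double *w0,
                  int start, int count, double *hs_local)
{
    for (int i = 0; i < count; i++) {
        double sum = 0.0;
        for (int j = 0; j < n0; j++)
            sum += in[j] * w0[(size_t)j * n1 + (size_t)(start + i)];
        hs_local[i] = sum;  /* assumes HS_1 was zero before the loop */
    }
}

/* Serial emulation of nprocs ranks. Writing each slice directly into
   hs + start stands in for the gather step of a real MPI program. */
void emulated_parallel_matvec(int n0, int n1, int nprocs,
                              const double *in, const double *w0,
                              double *hs)
{
    for (int rank = 0; rank < nprocs; rank++) {
        int start, count;
        block_range(n1, nprocs, rank, &start, &count);
        local_matvec(n0, n1, in, w0, start, count, hs + start);
    }
}
```

In an actual MPI program the loop over `rank` disappears: each process would obtain its own rank from `MPI_Comm_rank`, call `local_matvec` into a local buffer, and then a collective such as `MPI_Allgatherv` would assemble the slices on every process. With N1 = 1000 and four processes every slice has 250 entries, so plain `MPI_Allgather` with a count of 250 would be enough. If `IN` is itself distributed (for example, as the output of a previous step), the same allgather pattern would be applied to `IN` before the local multiplication.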
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Victor Eijkhout |
