Distributed Computing in Julia Slower than Serial
I have a Julia function that seems very amenable to parallelization: each iteration only manipulates the data at its own index. Yet when implemented with @distributed as below, it is slower than its serial equivalent. I have tried an equivalent implementation with distributed arrays instead of shared arrays, and it is even slower. There must be something simple I am missing here, but I cannot figure it out.
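    using Distributed, SharedArrays
    # logsumexp is not defined in the question; it is assumed to come from
    # a package such as StatsFuns.jl or LogExpFunctions.jl.

    function f(A1, A2, I1, I2, n1, n2, n3)
        # Shared output buffers that every worker process can write to
        B1 = convert(SharedArray, zeros(n1, n2))
        B2 = convert(SharedArray, zeros(n2, n3))
        # Split the iterations over d across the worker processes
        @sync @distributed for d in 1:n2
            for i in 1:n3
                B1[d, i] = A1[I1[d], I2[d][i]] / (A1[I1[d], I2[d][i]] + A2[I1[d], I2[d][i]])
                B2[:, d] .+= log.(A2[:, I2[d]])
            end
            # Normalize column d in log space
            B2[:, d] .-= logsumexp(B2[:, d])
        end
        # Copy the results back into ordinary arrays
        B1 = convert(Array, B1)
        B2 = convert(Array, B2)
        B2 = exp.(B2)
        return B1, B2
    end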
Solution 1:[1]
The amount of compute you're trying to distribute is likely much too small. Remember, all distributed computing incurs the overhead of sending data back and forth between processes, and that overhead is significant: the work done per process has to outweigh it before you see any actual speedup.
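One way to check whether overhead dominates is to time the same cheap kernel serially and with @distributed. The following is only a rough sketch under assumptions: the toy kernel, the function names fill_serial! and fill_distributed!, the array size, and the worker count are all hypothetical and not taken from the question.

```julia
using Distributed, SharedArrays

# Start two local worker processes if none exist yet (the count is arbitrary).
nprocs() == 1 && addprocs(2)
@everywhere using SharedArrays

# Hypothetical toy kernel: one cheap arithmetic operation per element,
# roughly as light as the loop body in the question.
function fill_serial!(B)
    for d in 1:size(B, 2), i in 1:size(B, 1)
        B[i, d] = i / (i + d)
    end
    return B
end

# Same kernel, with the columns split across worker processes.
function fill_distributed!(B::SharedArray)
    @sync @distributed for d in 1:size(B, 2)
        for i in 1:size(B, 1)
            B[i, d] = i / (i + d)
        end
    end
    return B
end

n = 200  # small on purpose, so per-element work is tiny

# First calls compile the functions and fill the arrays.
B_serial = fill_serial!(zeros(n, n))
B_dist = fill_distributed!(SharedArray{Float64}((n, n)))

# Second calls are timed without compilation cost.
t_serial = @elapsed fill_serial!(B_serial)
t_dist = @elapsed fill_distributed!(B_dist)
println("serial: ", t_serial, " s, distributed: ", t_dist, " s")
```

If the distributed timing is not clearly lower at the sizes you actually use, the per-iteration work is probably too small to pay for the scheduling and communication cost. Increasing n shows roughly where, if anywhere, the distributed version starts to win on a given machine.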
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Chris Rackauckas |