What is the best way to parallelise this code?
I have a function that I would like to maximise in Python. However, evaluating this function is fairly slow, and I would like to speed it up by parallelising the code. I'm not very familiar with how to do this, so any help would be appreciated.
In short, I have a cost() function that is optimised using some library, e.g. scipy.optimize. This function evaluates some other function do_calculation() 100 times and then averages the results (what that other function does doesn't matter, but its results have some statistical spread). The average is the quantity I want to maximise. However, evaluating it is quite expensive, especially hundreds of times, so I would like to parallelise the evaluation of that mean. How could I go about doing this efficiently?
import numpy as np

def function():
    val_list = []
    for i in range(100):
        val = do_calculation()
        val_list.append(val)
    return np.mean(np.array(val_list))
I was thinking about using multiprocessing to split up this loop, but then how do I rejoin all the values on different processors to calculate a final mean?
Solution 1:[1]
With multiprocessing you can use a process pool to map your function over a list of arguments:
from multiprocessing import Pool

import numpy as np

def do_calculation(idx):
    pass  # write your code here; idx is the argument passed in by map

N_PROC = 2

def function():
    with Pool(N_PROC) as p:
        # map distributes the 100 calls across the worker processes and
        # collects the results back into a single list, in order
        val_list = p.map(do_calculation, range(100))
    return np.mean(np.array(val_list))
Since you are computing an average, which does not depend on the order of the results, you could also use imap_unordered; this can be advantageous when individual jobs take considerably different amounts of time.
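A minimal runnable sketch of the imap_unordered variant. The body of do_calculation here is a placeholder (a random draw seeded by the index) standing in for the asker's real expensive computation:

```python
from multiprocessing import Pool

import numpy as np

def do_calculation(idx):
    # Placeholder for the real expensive computation: any independent
    # calculation with some statistical spread works here.
    rng = np.random.default_rng(idx)
    return rng.normal(loc=1.0, scale=0.1)

N_PROC = 2

def function():
    with Pool(N_PROC) as p:
        # imap_unordered yields results in completion order rather than
        # submission order, which is fine because a mean is order-independent.
        val_list = list(p.imap_unordered(do_calculation, range(100)))
    return np.mean(val_list)

if __name__ == "__main__":
    # The __main__ guard matters on platforms that spawn (rather than fork)
    # worker processes, e.g. Windows and macOS.
    print(function())
```

Both map and imap_unordered also accept a chunksize argument, which can reduce inter-process overhead when each individual call is cheap.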
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Bob |
