Can multiprocessing.Queue replace Manager.list() in Python?
I am working on a project that uses multiprocessing, and I am trying to minimize the run time. I have measured that one process takes around 4 seconds, so 8 processes working in parallel should take about the same time, or at most around 6 to 7 seconds.
Among the arguments passed to each process is a shared Manager.list() (let's call it main_list). After processing a text file (which involves conversions, transformations, and multiplications of hex data), each process appends a list to main_list.
Same procedure is followed in all 8 processes.
With Manager.list(), the run took around 22 seconds. I wanted a workaround to reduce this time. I am now using a Queue instead, but it seems the queue may not be effective for this approach?
import multiprocessing as mp
import time

def square(x, q):
    q.put((x, x * x))

if __name__ == '__main__':
    qout = mp.Queue()
    processes = []
    t1 = time.perf_counter()
    for i in range(10):
        p = mp.Process(target=square, args=(i, qout))
        p.start()
        processes.append(p)
    # Drain the queue before joining: joining first can block if a child
    # is still flushing data it has put on the queue.
    unsorted_result = [qout.get() for p in processes]
    for p in processes:
        p.join()
    result = [t[1] for t in sorted(unsorted_result)]
    t2 = time.perf_counter()
    print(t2 - t1)
    print(result)
OUTPUT
0.7646916
I want to be sure whether I can use a Queue this way instead of Manager.list() to reduce the run time.
I am sorry for not sharing the actual code.
Solution 1:[1]
See my comment to your question. This would be the solution using a multiprocessing pool with method map:
from multiprocessing import Pool

def square(x):
    return x * x

if __name__ == '__main__':
    # Create a pool with 10 processes:
    pool = Pool(10)
    result = pool.map(square, range(10))
    print(result)
    pool.close()
    pool.join()
Prints:
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
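The same pattern should carry over to the case in the question, where each worker produces a whole list rather than a single number. The following is only a sketch: `process_chunk` and the sample hex strings are hypothetical stand-ins for the real file processing, which was not shared.

```python
from multiprocessing import Pool


def process_chunk(hex_words):
    """Hypothetical stand-in for the per-file work: parse hex strings and scale them."""
    return [int(word, 16) * 2 for word in hex_words]


if __name__ == '__main__':
    # Made-up input: one list of hex strings per "file".
    chunks = [['0a', 'ff'], ['10', '01'], ['7f']]
    with Pool(processes=3) as pool:
        # map returns one result list per chunk, in input order,
        # so no shared list and no sorting step are needed.
        main_list = pool.map(process_chunk, chunks)
    print(main_list)  # [[20, 510], [32, 2], [254]]
```

Because each worker simply returns its list, the parent collects the results in input order without any proxy object.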
The managed list that you were using is represented by a proxy object. Every append operation you do on that list results in a message being sent to a thread running in a process started by the multiprocessing.managers.SyncManager instance that was created when you presumably called multiprocessing.Manager(). It is in that process where the actual list resides. So managed lists are generally not the most efficient solution available.
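A rough way to see the proxy overhead is to time the same number of appends on a managed list and on a plain list. This is a minimal sketch, not a benchmark: `time_appends` is a helper invented for this illustration, the count of 10,000 is arbitrary, and the measured times will vary by machine.

```python
import time
from multiprocessing import Manager


def time_appends(target, n):
    """Return the seconds taken to append n integers to target."""
    start = time.perf_counter()
    for i in range(n):
        target.append(i)
    return time.perf_counter() - start


if __name__ == '__main__':
    n = 10_000  # arbitrary count chosen for illustration
    with Manager() as manager:
        managed = manager.list()
        # Each append here is a round trip to the manager process.
        proxy_seconds = time_appends(managed, n)
    # Appends to a plain list stay inside the current process.
    local_seconds = time_appends([], n)
    print(f'managed list: {proxy_seconds:.4f}s, plain list: {local_seconds:.4f}s')
```

If the workers append only once each, this per-call cost is paid only a handful of times, so it is worth measuring where the time actually goes before attributing all of it to the managed list.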
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
