'>90% of the time is spent on method 'acquire' of 'thread.lock' objects

To identify the step that is using most of the computation time, I ran cProfile and got the following result:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.014    0.014  216.025  216.025 func_poolasync.py:2(<module>)
    11241  196.589    0.017  196.589    0.017 {method 'acquire' of 'thread.lock' objects}
      982    0.010    0.000  196.532    0.200 threading.py:309(wait)
     1000    0.002    0.000  196.498    0.196 pool.py:565(get)
     1000    0.005    0.000  196.496    0.196 pool.py:557(wait)
515856/3987    0.350    0.000   13.434    0.003 artist.py:230(stale)

The step on which most of the time is spent clearly seems to be method 'acquire' of 'thread.lock' objects. I have not used threading; rather I've used pool.apply_async with several processors, so I'm confused why thread.lock is the issue?

I am hoping to have some light shed on why this is the bottleneck? And how to bring down this time?

Here's how the code looks:

path='/usr/home/work'
filename='filename'

with open(path+filename+'/'+'result.pickle', 'rb') as f:
     pdata = pickle.load(f)

if __name__ == '__main__':
    pool = Pool()    
    results=[]
    data=list(range(1000))
    print('START')
    start_time = int(round(time.time()))
    result_objects = [pool.apply_async(func, args=(nomb,pdata[0],pdata[1],pdata[2])) for nomb in data]

    results = [r.get() for r in result_objects]

    pool.close()
    pool.join()
    print('END', int(round(time.time()))-start_time)

Revision:

By switching from pool.apply_async to pool.map I am able to reduce the execution time by ~3 times.

Output:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.113    0.113   70.824   70.824 func.py:2(<module>)
     4329   28.048    0.006   28.048    0.006 {method 'acquire' of 'thread.lock' objects}
        4    0.000    0.000   28.045    7.011 threading.py:309(wait)
        1    0.000    0.000   28.044   28.044 pool.py:248(map)
        1    0.000    0.000   28.044   28.044 pool.py:565(get)
        1    0.000    0.000   28.044   28.044 pool.py:557(wait)

Modification to code:

if __name__ == '__main__':
    pool = Pool()    
    data=list(range(1000))

    print('START')
    start_time = int(round(time.time()))
    funct = partial(func,pdata[0],pdata[1],pdata[2])
    results = pool.map(funct,data)
    print('END', int(round(time.time()))-start_time)

But, it has been found that some of the iterations result in nonsensical results. I'm not sure why this is happening, however, it is seen that 'method 'acquire' of 'thread.lock' objects' is still the rate determining step.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source