'How to use threading on dictionaries to improve time complexity?

I am new to threading. What I know is we can call threads on functions, but I want to call it on a dictionary.

I have a dictionary which have random numbers present in different indexes. I want to find the sum of all those numbers. What I want to do is basically to use a thread for every single row/index of that dictionary. That single thread will find sum of all the numbers in that specific row and then these sums of all the threads will be summed together to get me the final result.

import random
import time

li = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u"
, "v", "w", "x", "y"]

arr = {}

for k in range(0, 25):
    arr[li[k]] = [random.randrange(1, 10, 1) for i in range(1000000)]

start = time.perf_counter()

sum = 0
for k, v in arr.items():
    for value in v:
        sum += value 

end = time.perf_counter()

print(sum)

print("Finished in: ", round(end-start, 2), " seconds")

I used to do it the simple way and it took me about 86 seconds in total (due to assigning of numbers to dictionary) and 5 seconds total to calculate the sum.

I want to improve those 5 seconds of the sum calculation by creating thread for every single index of the dictionary. Can anyone help me on this?



Solution 1:[1]

I know...we can call threads on functions.

Nope. You can't call a thread on anything. When you write this:

thread = threading.Thread(foobar, args=(x, y, z))

You're not calling a thread. You are calling the constructor of the Thread class. The constructor makes a new Thread object, and then it's the Thread that does the calling: The Thread calls foobar(x, y, z).

What I want to do is basically to use a thread for every single row/index of that dictionary. That single thread will find sum of all the numbers in that specific row and...

Threads run code, and you have to provide the code that a thread will run in the form of a function. If you wanted a thread to "find the sum of all the numbers in a specific row..,"* then you'd have to write a function that finds the sum of all of the numbers, and then you'd have to create a new Thread that would call your function.


* Some of the other answers and comments on your question explain how Python's Global Interpreter Lock (a.k.a., the GIL) prevents you from using threads to make your program run any faster. So, the rest of this answer is fantasy because it won't make your program faster, but it does illustrate how to create threads.


Probably, you'll want to pass the dictionary and the row number to the function as arguments. Maybe you'll also want to pass it some mutable result structure (e.g., an array) into which the function can save the result.

def FindRowSum(dictionary, row, results):
    sum = 0
    for ...:
        sum = sum + ...
    results[row] = sum

...

allThreads = []
results = []
for row in range(...):
    thread = threading.Thread(FindRowSum, args=(myDictionary, row, results))
    allThreads.append(thread)

Then, further on down, if you want to wait for all of the threads to finish their work:

for thread in allThreads:
    thread.join()

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Solomon Slow