How to assign a sequential number to ThreadPoolExecutor worker
Assume the following code example, where a list of files is uploaded using ThreadPoolExecutor as the executor.
def upload_segments(segment_upload_list):
    def __upload(object_path_pair):
        libera_resource.upload_file(*object_path_pair)
        print("Segment uploaded!")

    with ThreadPoolExecutor() as executor:
        executor.map(__upload, segment_upload_list)

upload_segments(segment_upload_list)
How can I number each upload by its position in my list in this multithreaded scenario? I'd like to end up displaying something like this:
"Segment 10/4310 uploaded."
I'm aware that the output cannot be sequential here due to the nature of multithreading, but it would at least provide a progress overview. I guess counting the number of threads I've already started would also do the trick for counting the segments already uploaded.
Thanks in advance
Solution 1:[1]
Here's an illustration of how to use a global counter like I suggested in a comment. Each upload is being simulated by having each worker thread sleep for a random amount of time.
from concurrent.futures import ThreadPoolExecutor
from time import sleep
import threading
import random

count_lock = threading.Lock()
count = 0

def upload_file(segment):
    global count
    print(f'Uploading segment #{segment}...')
    sleep(random.uniform(1, 5))  # Simulate variable-length upload.
    with count_lock:
        count += 1
        print(f' {count} of {len(segment_upload_list)} uploaded.')

def upload_segments(segment_upload_list):
    global count
    count = 0
    with ThreadPoolExecutor(max_workers=3) as executor:
        executor.map(upload_file, segment_upload_list)
    print('\nAll segments uploaded!')

segment_upload_list = list(range(1, 11))
upload_segments(segment_upload_list)
Sample output follows. Note how it immediately starts three upload tasks and then starts another one every time one finishes. This is because of the max_workers=3 limit specified when the ThreadPoolExecutor was created. As you can see, it doesn't really matter which of the three worker threads performed a given task, as I said in an earlier comment.
Uploading segment #1...
Uploading segment #2...
Uploading segment #3...
1 of 10 uploaded.
Uploading segment #4...
2 of 10 uploaded.
Uploading segment #5...
3 of 10 uploaded.
Uploading segment #6...
4 of 10 uploaded.
Uploading segment #7...
5 of 10 uploaded.
Uploading segment #8...
6 of 10 uploaded.
Uploading segment #9...
7 of 10 uploaded.
Uploading segment #10...
8 of 10 uploaded.
9 of 10 uploaded.
10 of 10 uploaded.
All segments uploaded!
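For comparison, here is a minimal sketch of a variation that needs neither a global counter nor a lock: the tasks are submitted individually and the main thread counts them as they finish, using concurrent.futures.as_completed. This is not part of the answer above. The upload is simulated with a short sleep, and the max_workers parameter and the returned list of messages are assumptions added for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from time import sleep
import random


def upload_file(segment):
    """Stand-in for a real upload; sleeps to simulate variable duration."""
    sleep(random.uniform(0.01, 0.05))
    return segment


def upload_segments(segment_upload_list, max_workers=3):
    """Upload all segments and return the progress messages in completion order."""
    total = len(segment_upload_list)
    messages = []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(upload_file, s) for s in segment_upload_list]
        # as_completed yields each future as soon as it finishes, so the
        # counter is only ever touched by the main thread.
        for done, future in enumerate(as_completed(futures), start=1):
            future.result()  # Re-raises any exception from the worker.
            message = f'{done} of {total} uploaded.'
            print(message)
            messages.append(message)
    return messages


if __name__ == '__main__':
    upload_segments(list(range(1, 11)))
```

Because only the main thread increments the counter, no synchronization is needed, and calling future.result() surfaces upload errors that would otherwise go unnoticed.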
Solution 2:[2]
My final solution:
from concurrent.futures import ThreadPoolExecutor
import threading

count_lock = threading.Lock()
count = 0

def __upload(object_path_pair):
    global count
    libera_resource.upload_file(*object_path_pair)
    with count_lock:
        # Increment before printing so the final message reads "N of N".
        count += 1
        print(f' Segment {count} of {len(segment_upload_list)} uploaded successfully.')

def upload_segments(segment_upload_list):
    global count
    count = 0
    with ThreadPoolExecutor() as executor:
        executor.map(__upload, segment_upload_list)
    print('\n!!! ALL SEGMENTS UPLOADED !!!')

upload_segments(segment_upload_list)
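If each message should also say which segment it refers to, one option is to pair every item with its position in the list using enumerate before handing it to the executor. The following is only a sketch under assumptions: fake_upload is a hypothetical stand-in for libera_resource.upload_file, and the upload parameter and the returned log are illustrative names that do not appear in the original code.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


def fake_upload(object_name, path):
    """Hypothetical stand-in for libera_resource.upload_file."""
    pass


def upload_segments(segment_upload_list, upload=fake_upload):
    """Upload every pair; return (list position, completion count) tuples."""
    total = len(segment_upload_list)
    count_lock = threading.Lock()
    completed = 0
    log = []

    def _upload(numbered_pair):
        nonlocal completed
        number, object_path_pair = numbered_pair  # 1-based position in the list.
        upload(*object_path_pair)
        with count_lock:  # Update and report the counter atomically.
            completed += 1
            log.append((number, completed))
            print(f'Segment #{number} done ({completed} of {total} uploaded).')

    with ThreadPoolExecutor() as executor:
        # Consuming the iterator surfaces exceptions raised in the workers.
        list(executor.map(_upload, enumerate(segment_upload_list, start=1)))
    return log
```

The segment number is fixed when the task is created, while the completion count still reflects the order in which uploads actually finish, so the two numbers will generally differ.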
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | |
