Can Python multithreading corrupt files?
I am using Python multithreading to process images in the following way:

```python
def run_single_image(self, row):
    filename = row["filename"]
    image = cv2.imread(filename)
    new_image = self.image_processor(image)
    # new_image_path is derived from the input filename
    cv2.imwrite(new_image_path, new_image)
```

`row` is a row from a pandas DataFrame. To call this function I am using the following function:
```python
def run_multi(self) -> None:
    ex_list = []
    print("running")
    with concurrent.futures.ThreadPoolExecutor(max_workers=16) as executor:
        for _, row in self.df.iterrows():
            ex_list.append(executor.submit(self.run_single_image, row))
        executor.shutdown(wait=True)
```
However, when I read the processed images back, some of them are corrupted and I get a cv2 error. Can multithreading corrupt these files?
Solution 1:[1]
Well, after examining the dataframe again, there are rows that refer to the same filenames. Because new_image_path is composed of the old filename, in some cases two threads wrote to the same file concurrently, producing an image that was half-processed according to one row and half-processed according to another.
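One way to avoid the collision is to drop rows that share a filename before submitting work, so no two threads ever target the same output path. The sketch below assumes the same `filename` column as the question and replaces the real image processing with a stand-in function; it also calls `.result()` on each future, which re-raises any exception from a worker thread instead of silently swallowing it:

```python
import concurrent.futures
import pandas as pd

# Hypothetical dataframe; in the question each row carries an input filename.
df = pd.DataFrame({"filename": ["a.png", "b.png", "a.png"]})

def process(row):
    # Stand-in for the real image processing; returns the output path it would write.
    return row["filename"] + ".out"

def run_multi(df):
    # Keep only one row per input file so no two threads share an output path.
    unique_df = df.drop_duplicates(subset=["filename"])
    with concurrent.futures.ThreadPoolExecutor(max_workers=16) as executor:
        futures = [executor.submit(process, row) for _, row in unique_df.iterrows()]
        # .result() re-raises any exception raised inside the worker thread.
        return sorted(f.result() for f in concurrent.futures.as_completed(futures))

print(run_multi(df))  # ['a.png.out', 'b.png.out']
```

If the duplicate rows are intentional (e.g. different processing per row), the alternative is to make `new_image_path` unique per row, for instance by including the row index in the output filename.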
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | s.b |
