How to convert images into a NumPy array quickly?
To train an image classification model, I load the input data as NumPy arrays, and I have thousands of images to process. Currently, I loop through each image and convert it into a NumPy array, as shown below.
import glob
from time import time

import cv2
import numpy as np

tem_arr_list = []
images_list = glob.glob(r'C:\Datasets\catvsdogs\cat\*.jpg')

start = time()  # start the timer once, before the loop
for image_path in images_list:
    # Read each image only once; cv2.imread already returns a NumPy array
    temp_arr = np.array(cv2.imread(image_path))
    # print(temp_arr.shape)
    tem_arr_list.append(temp_arr)
print("Total time taken {}".format(time() - start))
Running this method takes a lot of time when the dataset is huge, so I tried a list comprehension instead:
tem_arr_list = [np.array(cv2.imread(image_path)) for image_path in images_list]
which is slightly quicker than the explicit loop, but still not the fastest option.
I'm looking for any other way to reduce the time this operation takes. Any help or suggestions would be appreciated.
Solution 1:[1]
Use a multiprocessing pool to load the data in parallel. My PC has 16 CPUs. I tried loading 100 images, and the times taken are shown below.
import glob
import multiprocessing
from time import time

import cv2


def load_image(image_path):
    # cv2.imread returns the decoded image as a NumPy array
    return cv2.imread(image_path)


if __name__ == '__main__':
    image_path_list = glob.glob('*.png')
    try:
        cpus = multiprocessing.cpu_count()
    except NotImplementedError:
        cpus = 2  # arbitrary default

    # Parallel load: one worker process per CPU
    pool = multiprocessing.Pool(processes=cpus)
    start = time()
    images = pool.map(load_image, image_path_list)
    print("Total time taken using multiprocessing pool {} seconds".format(time() - start))
    pool.close()
    pool.join()

    # Baseline: plain for loop
    images = []
    start = time()
    for image_path in image_path_list:
        images.append(load_image(image_path))
    print("Total time taken using for loop {} seconds".format(time() - start))

    # Baseline: list comprehension
    start = time()
    images = [load_image(image_path) for image_path in image_path_list]
    print("Total time taken using list comprehension {} seconds".format(time() - start))
Output:
Total time taken using multiprocessing pool 0.2922379970550537 seconds
Total time taken using for loop 1.4935636520385742 seconds
Total time taken using list comprehension 1.4925990104675293 seconds
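As a possible variation on the pool approach above, a thread pool may also be worth benchmarking. With processes, every decoded array has to be pickled and sent back to the parent, which can become noticeable with thousands of images. The sketch below is only an illustration under a stated assumption: that OpenCV's Python bindings release the GIL while decoding, so threads can overlap the work. The helper name load_all and its max_workers parameter are hypothetical and not part of the original answer.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def load_all(paths, loader, max_workers=None):
    """Apply `loader` to every path with a thread pool, keeping input order.

    `load_all` is a hypothetical helper for illustration; `loader` is any
    callable that takes a path, for example cv2.imread.
    """
    if max_workers is None:
        # os.cpu_count() may return None, so fall back to a small default
        max_workers = os.cpu_count() or 2
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in the same order as `paths`
        return list(executor.map(loader, paths))


# Assumed usage with OpenCV (not executed here):
#   import glob, cv2
#   images = load_all(glob.glob('*.png'), cv2.imread)
```

Whether this beats multiprocessing.Pool depends on image size, disk speed, and how much of the time is spent in decoding, so it is best timed on the same 100 images before drawing any conclusion.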
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
