Loop Vectorisation

I need to vectorise the following loop using NumPy to improve performance:

import numpy as np

X, Y = [], []  # collect the per-example arrays here
for example in client_local_dataset:
    X.append(example['image'].numpy())
    Y.append(example['label'].numpy())
x_train = np.array(X)
y_train = np.array(Y)


Solution 1:[1]

Your code example is incomplete because the definition of client_local_dataset is missing. Here is my guess at what it is and what you want:

import numpy as np

client_local_dataset = [{"image": "i1", "label": 'a'}, {"image": "i2", "label": 'b'}]

my_vec_fun1 = np.vectorize(lambda elem: elem["image"])
my_vec_fun2 = np.vectorize(lambda elem: elem["label"])

X = my_vec_fun1(client_local_dataset)
Y = my_vec_fun2(client_local_dataset)

print(X, type(X))
print(Y, type(Y))

Functions wrapped with np.vectorize can take the whole list client_local_dataset as an argument. They return a NumPy array.
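
As a rough check of what np.vectorize does with the list, the sketch below records every call to the wrapped function. The names get_image and call_log are made up for illustration, and otypes=[object] is passed only to keep the output type explicit. The NumPy documentation describes np.vectorize as a convenience whose implementation is essentially a for loop, so the wrapped Python function should still run at least once per element:

```python
import numpy as np

# Hypothetical stand-in for client_local_dataset, as guessed above.
client_local_dataset = [{"image": "i1", "label": "a"}, {"image": "i2", "label": "b"}]

call_log = []  # illustrative helper: one entry per Python-level call


def get_image(elem):
    """Return the "image" field and record that we were called."""
    call_log.append(elem["image"])
    return elem["image"]


# otypes=[object] fixes the output dtype up front.
vec_get_image = np.vectorize(get_image, otypes=[object])
X = vec_get_image(client_local_dataset)

print(X, type(X))
print("Python-level calls:", len(call_log))
```

If the number of calls grows with the length of the list, the wrapper is tidier to write but is unlikely to be faster than an explicit loop.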

EDIT: A list comprehension would be faster:

X = np.array([elem["image"] for elem in client_local_dataset])
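
The same idea can be applied to both fields. The sketch below is only an illustration under assumptions: the dataset is a made-up list of dicts whose "image" values are NumPy arrays of identical shape and whose "label" values are integers. With real tensors, as in the question, each field would first need .numpy() called on it. np.stack joins the equally shaped images along a new leading axis:

```python
import numpy as np

# Hypothetical data: three 2x2 "images" with integer labels 0, 1, 2.
client_local_dataset = [
    {"image": np.full((2, 2), i, dtype=np.float32), "label": i}
    for i in range(3)
]

# One pass per field; np.stack requires every image to have the same shape.
x_train = np.stack([example["image"] for example in client_local_dataset])
y_train = np.array([example["label"] for example in client_local_dataset])

print(x_train.shape, y_train.shape)  # (3, 2, 2) (3,)
```

This still iterates over the examples in Python, so any speed-up over the original loop is likely to be modest. The per-example work is the field access, which NumPy cannot vectorise for a list of dicts.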

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1