tf.data: How to load npy data for training?
I am trying to use tf.data.Dataset on a downloaded dataset of over 1 million audio spectrograms that I have already computed and stored on my hard disk as npy files. I have stored all the paths in a CSV file and I am trying to load them with tf.data:
data_selected=pd.read_csv(data_path)
filename = data_selected['filename']
idx_frame=data_selected['index_frame']
labels_train=data_selected['class']
Then I use from_tensor_slices:
dataset = tf.data.Dataset.from_tensor_slices((filename, we_objects, idx_frame)).shuffle(len(spect_names))
Then I wrote a parse function to load each file:
def parse_function(filename, label, idx_frame):
    Sxx = np.load(filename, allow_pickle=True)
    Sxx = Sxx[idx_frame].reshape(Sxx.shape[1], Sxx.shape[2], 1)
    return Sxx, label
But when I do:
dataset = dataset.map(parse_function, num_parallel_calls= 'auto')
I get an error saying that filename is a tensor and not a string, so I used tf.numpy_function:
dataset = dataset.map(lambda item, lab, idx: tf.numpy_function(
    parse_function, [item, lab, idx], [tf.float32, tf.float32, tf.int32]),
    num_parallel_calls=tf.data.AUTOTUNE)
dataset.batch(batch_size)
dataset = dataset.prefetch(1)
options = tf.data.Options()
from tensorflow.data.experimental import AutoShardPolicy
options.experimental_distribute.auto_shard_policy = AutoShardPolicy.OFF
dataset = dataset.with_options(options)
dataset = dataset.cache()
This gives no errors, but then I run:
history=model.fit(dataset,
initial_epoch=i, epochs=end_epoch,
callbacks=[tbCallBack,model_checkpoint])
And I get this error:
InvalidArgumentError: 3 root error(s) found.
(0) Invalid argument: pyfunc_14 returns 2 values, but expects to see 3 values.
[[{{node PyFunc}}]]
[[MultiDeviceIteratorGetNextFromShard]]
[[RemoteCall]]
[[IteratorGetNextAsOptional]]
[[div_no_nan/ReadVariableOp/_270]]
(1) Invalid argument: pyfunc_14 returns 2 values, but expects to see 3 values.
[[{{node PyFunc}}]]
[[MultiDeviceIteratorGetNextFromShard]]
[[RemoteCall]]
[[IteratorGetNextAsOptional]]
[[replica_1/angular_distance/weighted_loss/cond/switch_pred/_149/_88]]
(2) Invalid argument: pyfunc_14 returns 2 values, but expects to see 3 values.
[[{{node PyFunc}}]]
[[MultiDeviceIteratorGetNextFromShard]]
[[RemoteCall]]
[[IteratorGetNextAsOptional]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_51059]
Function call stack:
train_function -> train_function -> train_function
Where is the error? Is this the right way to do it?
Thank you!
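For what it's worth, the error message itself points at a mismatch: parse_function returns two values (Sxx, label), while the Tout list passed to tf.numpy_function names three dtypes. A minimal sketch of one possible fix follows. It assumes float32 labels, and the temporary files, array shapes, and the helper name tf_parse are hypothetical stand-ins that are not part of the original post:

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Hypothetical stand-in data: two small .npy files shaped (frames, freq, time).
tmp_dir = tempfile.mkdtemp()
filenames = []
for i in range(2):
    path = os.path.join(tmp_dir, f"spec_{i}.npy")
    np.save(path, np.random.rand(3, 4, 5).astype(np.float32))
    filenames.append(path)
labels = np.array([0.0, 1.0], dtype=np.float32)  # assumed float32 labels
idx_frames = np.array([1, 2], dtype=np.int32)


def parse_function(filename, label, idx_frame):
    # Inside tf.numpy_function a scalar string arrives as bytes.
    sxx = np.load(filename.decode("utf-8"))
    # Pick one frame and add a trailing channel axis.
    sxx = sxx[idx_frame][..., np.newaxis].astype(np.float32)
    return sxx, label  # two return values


def tf_parse(filename, label, idx_frame):
    # Tout lists exactly two dtypes, matching the two return values above.
    sxx, label = tf.numpy_function(
        parse_function, [filename, label, idx_frame], [tf.float32, tf.float32]
    )
    # numpy_function drops static shape information; restore what is known.
    sxx.set_shape([None, None, 1])
    label.set_shape([])
    return sxx, label


dataset = (
    tf.data.Dataset.from_tensor_slices((filenames, labels, idx_frames))
    .shuffle(len(filenames))
    .map(tf_parse, num_parallel_calls=tf.data.AUTOTUNE)
    .batch(2)  # batch() returns a new dataset, so the result is kept
    .prefetch(tf.data.AUTOTUNE)
)

batch_x, batch_y = next(iter(dataset))
```

Note that tf.data transformations return new datasets rather than modifying the existing one, so a bare dataset.batch(batch_size) line, as in the code above the error, discards its result and leaves the dataset unbatched.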
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
