Is it possible to split a TensorFlow dataset into train, validation AND test datasets when using image_dataset_from_directory?
I am using tf.keras.utils.image_dataset_from_directory to load a dataset of 4575 images. While this function can split the data into two subsets (with the validation_split parameter), I want to split it into training, validation, and testing subsets.
I have tried using dataset.skip() and dataset.take() to further split one of the resulting subsets, but these functions return a SkipDataset and a TakeDataset, respectively (whereas the documentation says they return a Dataset). This leads to problems when fitting the model: the metrics calculated on the validation set (val_loss, val_accuracy) disappear from the model history.
So, my question is: is there a way to split a Dataset into three subsets for training, validation and testing, so that all three subsets are also Dataset objects?
Code used to load the data
import tensorflow as tf

def load_data_tf(data_path: str, img_shape=(256, 256), batch_size: int = 8):
    # Training subset
    train_ds = tf.keras.utils.image_dataset_from_directory(
        data_path,
        validation_split=0.2,
        subset="training",
        label_mode='categorical',
        seed=123,
        image_size=img_shape,
        batch_size=batch_size)
    # Held-out subset, split further into test and validation below
    val_ds = tf.keras.utils.image_dataset_from_directory(
        data_path,
        validation_split=0.3,
        subset="validation",
        label_mode='categorical',
        seed=123,
        image_size=img_shape,
        batch_size=batch_size)
    return train_ds, val_ds

train_dataset, test_val_ds = load_data_tf('data_folder', img_shape=(256, 256), batch_size=8)
test_dataset = test_val_ds.take(686)
val_dataset = test_val_ds.skip(686)
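A quick check of the numbers in the code above may help explain the symptom. The sketch below uses only the figures given in the question (4575 images, validation_split=0.3, batch_size=8, and the argument 686) and assumes the held-out subset contains int(0.3 * 4575) images. Because the dataset is already batched, take() and skip() count batches rather than images.

```python
import math

n_images = 4575          # dataset size stated in the question
validation_split = 0.3   # value passed for subset="validation"
batch_size = 8

# Assumption: the held-out subset holds int(validation_split * n_images) images.
n_val_images = int(validation_split * n_images)

# A batched dataset yields one element per batch; the last batch may be partial.
n_val_batches = math.ceil(n_val_images / batch_size)

requested = 686  # argument passed to take() and skip() in the question

# take(n) returns at most n elements; skip(n) drops the first n elements.
taken = min(requested, n_val_batches)
left_after_skip = max(n_val_batches - requested, 0)

print(n_val_images, n_val_batches, taken, left_after_skip)
```

Under these assumptions the held-out subset has far fewer than 686 batches, so take(686) would return all of them and skip(686) would return an empty dataset, which is consistent with validation metrics not being reported.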
Model compilation and fitting
model.compile(optimizer='sgd',
              loss=tf.keras.losses.CategoricalCrossentropy(from_logits=False),
              metrics=['accuracy'])
history = model.fit(train_dataset, epochs=50, validation_data=val_dataset, verbose=1)
When using a normal Dataset, val_accuracy and val_loss are present in the history of the model, but when using a SkipDataset, they are not.
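One possible way to get two non-empty parts is to ask the dataset how many batches it holds and split on that count. The following is a minimal sketch, assuming the dataset has a known, finite size; the helper name split_batches and the default 50/50 ratio are illustrative and not part of TensorFlow. The objects returned by take() and skip() are subclasses of tf.data.Dataset, so both parts can be passed to model.fit or model.evaluate.

```python
def split_batches(dataset, first_fraction=0.5):
    """Split a batched dataset into two parts, counting in batches.

    Hypothetical helper: works with any object that provides
    cardinality(), take() and skip(), such as a tf.data.Dataset.
    """
    n_batches = int(dataset.cardinality())
    if n_batches < 0:
        # tf.data reports a negative value for infinite or unknown sizes.
        raise ValueError("dataset size is unknown or infinite")

    # Number of batches that go into the first part.
    n_first = int(n_batches * first_fraction)
    return dataset.take(n_first), dataset.skip(n_first)
```

With the variables from the question, this could be called as val_dataset, test_dataset = split_batches(test_val_ds). If the source dataset reshuffles on every pass, the two parts may exchange samples between epochs, so that is worth verifying for a given setup.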
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow