ValueError: Unable to create dataset (name already exists) while saving TensorFlow model
I am trying to save the trained model below.
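from tensorflow.keras.applications import ResNet50V2
from tensorflow.keras.layers import AvgPool2D, Dense, Dropout, Flatten
from tensorflow.keras.models import Model
resnet = ResNet50V2(input_shape=(im_size,im_size,3), weights='imagenet', include_top=False)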
headModel = AvgPool2D(pool_size=(3,3))(resnet.output)
headModel = Flatten(name="flatten")(headModel)
headModel = Dense(256, activation="relu")(headModel)
headModel = Dropout(0.5)(headModel)
headModel = Dense(1, activation="sigmoid")(headModel)
resnet50v2 = Model(inputs=resnet.input, outputs=headModel)
resnet50v2.compile(loss='binary_crossentropy', optimizer=opt, metrics=METRICS)
history = resnet50v2.fit(
    datagen.flow(X_train, y_train, batch_size=32, subset='training'),
    batch_size=batch_size,
    epochs=150,
    steps_per_epoch=steps_per_epoch,
    validation_data=datagen.flow(X_train, y_train, batch_size=8, subset='validation'))
However, whenever I try to save it with the following command:
resnet50v2.save('Saved_Models/resnet50.h5', save_format='h5')
I get the following error:
ValueError Traceback (most recent call last)
/tmp/ipykernel_3252071/2034094124.py in <module>
----> 1 resnet50v2.save('Saved_Models/resnet50.h5', save_format='h5')
~/.local/lib/python3.8/site-packages/keras/utils/traceback_utils.py in error_handler(*args, **kwargs)
65 except Exception as e: # pylint: disable=broad-except
66 filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67 raise e.with_traceback(filtered_tb) from None
68 finally:
69 del filtered_tb
~/.local/lib/python3.8/site-packages/h5py/_hl/group.py in create_dataset(self, name, shape, dtype, data, **kwds)
147 group = self.require_group(parent_path)
148
--> 149 dsid = dataset.make_new_dset(group, shape, dtype, data, name, **kwds)
150 dset = dataset.Dataset(dsid)
151 return dset
~/.local/lib/python3.8/site-packages/h5py/_hl/dataset.py in make_new_dset(parent, shape, dtype, data, name, chunks, compression, shuffle, fletcher32, maxshape, compression_opts, fillvalue, scaleoffset, track_times, external, track_order, dcpl, allow_unknown_filter)
140
141
--> 142 dset_id = h5d.create(parent.id, name, tid, sid, dcpl=dcpl)
143
144 if (data is not None) and (not isinstance(data, Empty)):
h5py/_objects.pyx in h5py._objects.with_phil.wrapper()
h5py/_objects.pyx in h5py._objects.with_phil.wrapper()
h5py/h5d.pyx in h5py.h5d.create()
ValueError: Unable to create dataset (name already exists)
How can I save my models?
Solution 1:[1]
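Here is an example that seems to work. It saves the model in the SavedModel format with tf.saved_model.save instead of writing an HDF5 file: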
import tensorflow as tf
resnet = tf.keras.applications.ResNet50V2(input_shape=(224, 224, 3), weights='imagenet', include_top=False)
headModel = tf.keras.layers.AvgPool2D(pool_size=(3,3))(resnet.output)
headModel = tf.keras.layers.Flatten(name="flatten")(headModel)
headModel = tf.keras.layers.Dense(256, activation="relu")(headModel)
headModel = tf.keras.layers.Dropout(0.5)(headModel)
headModel = tf.keras.layers.Dense(1, activation="sigmoid")(headModel)
resnet50v2 = tf.keras.Model(inputs=resnet.input, outputs=headModel)
resnet50v2.compile(loss='binary_crossentropy', optimizer='adam')
x = tf.random.normal((20, 224, 224, 3))
y = tf.random.uniform((20, 1), maxval=2, dtype=tf.int32)
resnet50v2.fit(x, y, batch_size=2, epochs=2)
tf.saved_model.save(resnet50v2, 'saved_model')
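As a side note, one commonly reported cause of this h5py error is two weights sharing the same name, since the HDF5 format stores each weight as a dataset under its name. Whether that is what happens in the question is an assumption. Here is a minimal sketch for checking it; find_duplicate_names and the example names are hypothetical and only illustrate the idea:

```python
from collections import Counter


def find_duplicate_names(names):
    """Return a dict mapping each name that occurs more than once to its count."""
    counts = Counter(names)
    return {name: n for name, n in sorted(counts.items()) if n > 1}


# Hypothetical weight names, standing in for [w.name for w in resnet50v2.weights]
example_names = [
    "dense/kernel:0",
    "dense/bias:0",
    "dense_1/kernel:0",
    "dense/kernel:0",  # same name as the first entry
]

duplicates = find_duplicate_names(example_names)
print(duplicates)
```

With the real model, one could pass [w.name for w in resnet50v2.weights] instead of the example list. An empty result would suggest looking elsewhere, for instance at the optimizer or the metrics.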
Solution 2:[2]
From what I'm reading, PySpark DataFrames do not have an index by default, so you might need to add one.
I do not know the exact PySpark syntax, but since it has many similarities with pandas, this might point you in the right direction:
df.loc[df.reRnk == 'yes', ['val','id']] = df.loc[df.reRnk == 'yes', ['val','id']].sort_values('val', ascending=False).set_index(df.loc[df.reRnk == 'yes', ['val','id']].index)
Basically, this isolates the rows where reRnk == 'yes', sorts them by val in descending order, and then sets the index of the sorted rows back to the original index. The sorted values are then assigned to the original rows in df.
For .loc, https://spark.apache.org/docs/3.2.0/api/python/reference/pyspark.pandas/api/pyspark.pandas.DataFrame.loc.html might be worth a try.
For sorting, see: https://sparkbyexamples.com/pyspark/pyspark-orderby-and-sort-explained/
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Paul |
