What to pass to Keras model.fit() as y, and how do I fit batches of differently-shaped data to a Keras model?
I have some data that I'm trying to work with using a Keras model. TL;DR:
1. What is the purpose of the second parameter of model.fit(), and what is the purpose of the validation_data parameter? I'm putting my ground_truth in y and have nothing to put in validation_data, which leaves me unable to use the callbacks parameter (see the sketch after this list).
2. How do I use batches for the data structure described below?
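To make question 1 concrete, this is how I currently understand the call is supposed to look (x_val and y_val are hypothetical held-out arrays that I don't actually have):

# Hypothetical sketch of my understanding of fit(); x_val/y_val don't exist in my data
hist = model.fit(
    training_data, ground_truth,       # second positional argument = y, the targets
    batch_size=128,
    validation_data=(x_val, y_val),    # separate held-out split, scored each epoch
    callbacks=[keras.callbacks.EarlyStopping(monitor='val_loss', patience=5)],
)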
My data consists of pairs of numpy arrays:
- training_data, shape (x,128)
- ground_truth (probability score), shape (x,)
where x varies between 3000 and 35000 depending on the file, but it is the same for both arrays.
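To make the shapes concrete, dummy stand-ins for one file would look like this (x = 9801 is chosen only to match the error message further down; any value in that range works):

import numpy as np
x = 9801  # varies per file, roughly 3000-35000
training_data = np.random.rand(x, 128).astype('float32')  # shape (x, 128)
ground_truth = np.random.rand(x).astype('float32')        # shape (x,), probability scores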
Since previous research found that a convolutional model fits my purpose best, I've created a model that takes inputs of shape (128, 128, 1) and returns a single probability score for each of the x mentioned above:
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.regularizers import l2

reg_amount = 1e-4  # L2 regularization strength (placeholder value)

model = keras.Sequential()
model.add(layers.Convolution2D(16, (3, 3), padding='valid', input_shape=(128, 128, 1),
                               strides=2, kernel_initializer='glorot_normal',
                               kernel_regularizer=l2(reg_amount), bias_regularizer=l2(reg_amount)))
model.add(layers.Activation('relu'))
model.add(layers.BatchNormalization())
# Some other layers
model.add(layers.Dense(1, kernel_initializer='glorot_normal',
                       kernel_regularizer=l2(reg_amount), bias_regularizer=l2(reg_amount)))
model.add(layers.Activation('linear'))
model.build()
model.summary()

optimizer = keras.optimizers.SGD(learning_rate=1e-3)
# Note: this loss object is defined but never used; compile() uses the 'msle' string instead
loss_fn = keras.losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(loss='msle', optimizer=optimizer)
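As a sanity check of the shape contract (assuming the omitted layers flatten the feature maps down to a vector), a single dummy window should yield one score:

import numpy as np
dummy = np.zeros((1, 128, 128, 1), dtype='float32')  # one window, batch axis first
print(model.predict(dummy).shape)  # expect (1, 1) if the hidden layers flatten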
Then I tried to fit this model, and I was unable to do it using batches. I passed training_data as the first parameter and ground_truth as the second:
import numpy as np

spec_file = np.load('data.npy')
anno_file = np.load('ground_truth.npy')
hist = model.fit(spec_file, anno_file, batch_size=128, verbose=0)
And I get the error message below:
ValueError: Data cardinality is ambiguous:
  x sizes: 128
  y sizes: 9801
Make sure all arrays contain the same number of samples.
I tried swapping dimensions and transposing, but the error message stays the same.
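Printing the shapes shows where the numbers in the error come from; Keras treats axis 0 of each array as the sample axis:

print(spec_file.shape)  # first dimension is 128  -> "x sizes: 128"
print(anno_file.shape)  # first dimension is 9801 -> "y sizes: 9801"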
I worked around it using a for loop that takes only one 128-frame window from each array per step, giving shapes (128, 128) and (128,), which made the fit function work:
losses = []
for n in range(0, np.shape(anno_file)[0] - 128):
    mel_spect = np.array(spec_file[0:128, n:n+128])  # (128, 128)
    mel_spect = mel_spect[np.newaxis, ...]           # (1, 128, 128)
    ground_truth = np.array(anno_file[n:n+128])      # (128,)
    ground_truth = ground_truth[np.newaxis, ...]     # (1, 128)
    hist = model.fit(mel_spect, ground_truth, verbose=0)
    losses.append(hist.history['loss'])
So basically I'm fitting this model thousands of times for every single file I analyze, which, I have a feeling, is not right. I should be able to push this data in batches, but I don't know how. I also feel my understanding of the fit() function is wrong.
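For what it's worth, here is the direction I imagine the fix should take: pre-cutting all the windows into one big array and calling fit() once. This is only a sketch; it assumes spec_file is laid out as (128, x) like in my loop above, and I don't know which of the 128 frame scores should be the target for each window:

num_windows = np.shape(anno_file)[0] - 128
# Stack every 128-frame window into one array of shape (num_windows, 128, 128, 1)
windows = np.stack([spec_file[0:128, n:n+128] for n in range(num_windows)])
windows = windows[..., np.newaxis]
# One target per window; whether it should be the first, center, or some
# aggregate of the 128 frame scores is part of what I'm asking
targets = anno_file[:num_windows]
hist = model.fit(windows, targets, batch_size=128, epochs=1, verbose=0)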
Please advise on how you would fit this data into Keras, because after days of trying I have no idea at all. Thanks a lot.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow