What to pass to Keras model.fit() as y, and how do I fit batches of data with different dimensions to a Keras model?

I have some data that I'm trying to model with Keras. TL;DR:

1. What is the purpose of the second parameter of model.fit(), and what is the purpose of the validation_data parameter? I'm passing my ground truth as y, and I have nothing left to put in validation_data, which prevents me from using the callbacks parameter.

2. How do I train in batches with the data structure below?

My data consists of pairs of numpy arrays:

  • training_data, shape (x,128)
  • ground_truth (probability score), shape (x,)

where x varies between 3,000 and 35,000 depending on the file, but is always the same for both arrays.

As previous research found that a convolutional model suits my purpose best, I've created a model with input shape (128, 128, 1) that returns a single probability score for each of the x mentioned above:

from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.regularizers import l2

reg_amount = 1e-4   # placeholder value; defined elsewhere in my script

model = keras.Sequential()
model.add(layers.Convolution2D(16, (3, 3), padding='valid',
                               input_shape=(128, 128, 1), strides=2,
                               kernel_initializer='glorot_normal',
                               kernel_regularizer=l2(reg_amount),
                               bias_regularizer=l2(reg_amount)))
model.add(layers.Activation('relu'))
model.add(BatchNormalization())
# ... some other layers ...
model.add(layers.Dense(1, kernel_initializer='glorot_normal',
                       kernel_regularizer=l2(reg_amount),
                       bias_regularizer=l2(reg_amount)))
model.add(layers.Activation('linear'))

model.build()
model.summary()

optimizer = keras.optimizers.SGD(learning_rate=1e-3)
loss_fn = keras.losses.SparseCategoricalCrossentropy(from_logits=True)  # currently unused
model.compile(loss='msle', optimizer=optimizer)

Then I tried to fit the model, but I was unable to do it in batches. I passed training_data as the first argument and ground_truth as the second:

    spec_file = np.load('data.npy')
    anno_file = np.load('ground_truth.npy')

    hist = model.fit(spec_file, anno_file, batch_size=128, verbose=0)

And I get the error message below:

    ValueError: Data cardinality is ambiguous:
        x sizes: 128
        y sizes: 9801
    Make sure all arrays contain the same number of samples.

I tried swapping dimensions and transposing, but the error message stays the same.
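If I understand correctly, Keras counts samples along axis 0 of every array it is given, so the mismatch can be reproduced with plain NumPy (the sample counts are taken from the error message above; the spectrogram's frame count is a hypothetical stand-in):

```python
import numpy as np

# Shapes as Keras sees them; 10000 frames is a hypothetical value.
spec_file = np.zeros((128, 10000))  # fit() counts 128 samples along axis 0
anno_file = np.zeros((9801,))       # fit() counts 9801 samples along axis 0

# fit() requires these axis-0 sizes to match; here they don't,
# which is exactly the "Data cardinality is ambiguous" situation.
print(spec_file.shape[0], anno_file.shape[0])
```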

I worked around it with a for loop that takes just one window from each array, giving shapes (128, 128) and (128,), which makes the fit function work:

for n in range(0, anno_file.shape[0] - 128, 1):
        mel_spect = np.array(spec_file[0:128, n:n+128])     # (128, 128)
        mel_spect = mel_spect[np.newaxis, ...]              # (1, 128, 128)

        ground_truth = np.array(anno_file[n:n+128])         # (128,)
        ground_truth = ground_truth[np.newaxis, ...]        # (1, 128)

        hist = model.fit(mel_spect, ground_truth, verbose=0)
        losses.append(hist.history['loss'])

So effectively I'm calling fit() thousands of times for every analyzed file, which I suspect is not right. I should be able to feed this data in batches, but I don't know how; I also suspect my understanding of the fit() function is wrong.
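For what it's worth, I suspect the loop above could be collapsed into a single batch array, something like the following sketch (shapes are hypothetical stand-ins for the loaded files; sliding_window_view needs NumPy >= 1.20, and I'm assuming one score per window, taken at the window's first frame):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy >= 1.20

# Hypothetical stand-ins for the loaded files.
spec_file = np.random.rand(128, 1000).astype(np.float32)  # (mel bins, frames)
anno_file = np.random.rand(1000).astype(np.float32)       # (frames,)

win = 128
# All 128x128 windows at once: result has shape (128, n_windows, win).
windows = sliding_window_view(spec_file, win, axis=1)
# Move the window axis to the front and add a channel axis
# -> (n_windows, 128, 128, 1), which matches the model's input shape.
batch_x = np.moveaxis(windows, 1, 0)[..., np.newaxis]
# One score per window (assumption: the score at the window's first frame).
batch_y = anno_file[: batch_x.shape[0]]

# Then a single call should work, since axis 0 of x and y now match:
# hist = model.fit(batch_x, batch_y, batch_size=128)
```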

Please advise how you would feed this data into Keras, because after days of trying I have no idea. Thanks a lot.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
