'Keras Dense Model ValueError: logits and labels must have the same shape ((None, 200, 1) vs (None, 1, 1))

I'm new in machine learning and I'm trying to train a model. I'm using this Keras oficial example as a guide to set my dataset and feed it into the model: https://www.tensorflow.org/api_docs/python/tf/keras/utils/Sequence

From the training data I have an sliding windows created for a single column and for the labels I have a binary classification (1 or 0).

This is the model creation code:

n = 200
hidden_units = n
dense_model = Sequential()
dense_model.add(Dense(hidden_units, input_shape=([200,1])))
dense_model.add(Activation('relu'))
dense_model.add(Dropout(dropout))
print(hidden_units)

while hidden_units > 2:
    hidden_units = math.ceil(hidden_units/2)
    dense_model.add(Dense(hidden_units))
    dense_model.add(Activation('relu'))
    dense_model.add(Dropout(dropout))
    print(hidden_units)
dense_model.add(Dense(units = 1, activation='sigmoid'))

This is the functions I'm using to compile the model:

def compile_and_fit(model, window, epochs, patience=2):
    early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss',
                                                      patience=patience,
                                                      mode='min')
    model.compile(loss='binary_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    model.fit(window.train , epochs=epochs)
    return history

This is the model training:

break_batchs = find_gaps(df_train, 'date_diff', diff_int_value)
for keys, values in break_batchs.items():
    dense_window = WindowGenerator(data=df_train['price_var'],
                                   data_validation=df_validation['price_var'],
                                   data_test=df_test['price_var'],
                                   input_width=n,
                                   shift=m,
                                   start_index=values[0],
                                   end_index=values[1], 
                                   class_labels=y_buy,
                                   class_labels_train=y_buy_train,
                                   class_labels_test=y_buy_test,
                                   label_width=1,
                                   label_columns=None,
                                   classification=True,
                                   batch_size=batch_size,
                                   seed=None)
    history = compile_and_fit(dense_model, dense_window)

and those are the shapes of the batches:

(TensorSpec(shape=(None, 200, 1), dtype=tf.float32, name=None), TensorSpec(shape=(None, 1, 1), dtype=tf.float64, name=None))

The problem is (I guess) that, from the model summary the model is training from the last dimension when it should be working in the second one:

dense_model.summary()

Model: "sequential_21"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
                                          |Model is being applied here
                                          |
                                          v
dense_232 (Dense)            (None, 200, 200)          400       
_________________________________________________________________
                                     |When it should be applied here
                                     |
                                     v
activation_225 (Activation)  (None, 200, 200)          0         
_________________________________________________________________
dropout_211 (Dropout)        (None, 200, 200)          0         
_________________________________________________________________
dense_233 (Dense)            (None, 200, 100)          20100     
_________________________________________________________________
activation_226 (Activation)  (None, 200, 100)          0         
_________________________________________________________________
dropout_212 (Dropout)        (None, 200, 100)          0         
_________________________________________________________________
dense_234 (Dense)            (None, 200, 50)           5050      
_________________________________________________________________
activation_227 (Activation)  (None, 200, 50)           0         
_________________________________________________________________
dropout_213 (Dropout)        (None, 200, 50)           0         
_________________________________________________________________
dense_235 (Dense)            (None, 200, 25)           1275      
_________________________________________________________________
activation_228 (Activation)  (None, 200, 25)           0         
_________________________________________________________________
dropout_214 (Dropout)        (None, 200, 25)           0         
_________________________________________________________________
dense_236 (Dense)            (None, 200, 13)           338       
_________________________________________________________________
activation_229 (Activation)  (None, 200, 13)           0         
_________________________________________________________________
dropout_215 (Dropout)        (None, 200, 13)           0         
_________________________________________________________________
dense_237 (Dense)            (None, 200, 7)            98        
_________________________________________________________________
activation_230 (Activation)  (None, 200, 7)            0         
_________________________________________________________________
dropout_216 (Dropout)        (None, 200, 7)            0         
_________________________________________________________________
dense_238 (Dense)            (None, 200, 4)            32        
_________________________________________________________________
activation_231 (Activation)  (None, 200, 4)            0         
_________________________________________________________________
dropout_217 (Dropout)        (None, 200, 4)            0         
_________________________________________________________________
dense_239 (Dense)            (None, 200, 2)            10        
_________________________________________________________________
activation_232 (Activation)  (None, 200, 2)            0         
_________________________________________________________________
dropout_218 (Dropout)        (None, 200, 2)            0         
_________________________________________________________________
dense_240 (Dense)            (None, 200, 1)            3         
=================================================================
Total params: 27,306
Trainable params: 27,306
Non-trainable params: 0
_________________________________________________________________


And because of that Im getting ValueError: logits and labels must have the same shape ((None, 200, 1) vs (None, 1, 1))

How can I tell Keras to apply the training in the second dimension and not the last one?

EDIT

Diagram explanation

This is what I understand is happening, is this right? How I fixed it?

Edit 2

I tried to modify as suggested, using:

dense_model.add(Dense(hidden_units, input_shape=(None,200,1)))

but I'm getting the following warning:

WARNING:tensorflow:Model was constructed with shape (None, None, 200, 1) for input KerasTensor(type_spec=TensorSpec(shape=(None, None, 200, 1), dtype=tf.float32, name='dense_315_input'), name='dense_315_input', description="created by layer 'dense_315_input'"), but it was called on an input with incompatible shape (None, 200, 1, 1).


Solution 1:[1]

The first dimension that you are pointing at is batch size, as you specified in your input layer (the input shape is [batch_size, input_dim] as can be seen here

dense_model.add(Dense(hidden_units, input_shape=([200,1])))

So your model is outputting 200 values because your batch size is 200, but the target label you are comparing only has one value.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 roman_ka