Is this architecture an autoencoder?

I want to create an autoencoder. I built this architecture and it works, but I want to know whether it is actually an autoencoder architecture.

    ## Encoder
    layer = layers.Conv2D(16, (3, 3), activation="relu", padding="same",data_format = 'channels_first')(input)
    layer = layers.MaxPooling2D((2, 2), padding="same",data_format = 'channels_first')(layer)
    layer = layers.Conv2D(32, (3, 3), activation="relu", padding="same",data_format = 'channels_first')(layer)
    layer = layers.MaxPooling2D((2, 2), padding="same",data_format = 'channels_first')(layer)

    ## Decoder
    layer = layers.Conv2DTranspose(16, (3, 3), strides=2, activation="relu", padding="same",data_format = 'channels_first')(layer)
    layer = layers.UpSampling2D((2,2))(layer)

    layer = layers.Conv2DTranspose(32, (3, 3), strides=2, activation="relu", padding="same",data_format = 'channels_first')(layer)
    layer = layers.UpSampling2D((2,2))(layer)

    #layer = layers.UpSampling2D((2,2))(layer)

    layer = layers.Flatten()(layer)
    dense = layers.Dense(784, activation="sigmoid")
    output = dense(layer)


Solution 1:[1]

There are some problems in your code:

  1. If you are using the functional API, your model needs an input layer:

    input = layers.Input(shape=(3, 192, 192))

  2. In an autoencoder, the output of your model needs to have the same dimensions as the input. However, in your model your output is a dense vector (1D), while your input is obviously at least 2D (or 3D if you have channels, like in images).

  3. You have specified the argument data_format = 'channels_first', which means the channel dimension of your input tensor comes before the spatial dimensions (axis 1 once the batch axis is counted). For example, if your input is an RGB image, it has shape (color_channel, height, width) instead of the more common (height, width, color_channel). That is fine, but 1) make sure your images really are channels-first, and 2) pass the same argument to your upsampling layers.
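To see why the output size matters, here is a rough pure-Python sketch that follows one spatial dimension through the layers. It assumes the 192-pixel input from point 1 and that every layer uses padding="same" with a consistent data_format. The helper `trace_spatial_size` and the operation labels are made up for illustration and are not part of Keras.

```python
import math

def trace_spatial_size(size, ops):
    """Follow one spatial dimension through a list of (op, factor) pairs."""
    sizes = [size]
    for op, factor in ops:
        if op == "conv":
            # Conv2D with stride 1 and padding="same" keeps the size.
            pass
        elif op == "pool":
            # MaxPooling2D with padding="same" gives ceil(n / factor).
            size = math.ceil(size / factor)
        elif op in ("tconv", "upsample"):
            # Conv2DTranspose(strides=factor, padding="same") and
            # UpSampling2D(factor) both multiply the size by factor.
            size = size * factor
        else:
            raise ValueError(f"unknown op: {op}")
        sizes.append(size)
    return sizes

encoder = [("conv", 1), ("pool", 2), ("conv", 1), ("pool", 2)]
# Decoder as posted in the question: stride-2 transposed convs plus upsampling.
question_decoder = [("tconv", 2), ("upsample", 2), ("tconv", 2), ("upsample", 2)]
# Decoder with stride 1 in the transposed convs, so only upsampling rescales.
stride1_decoder = [("tconv", 1), ("upsample", 2), ("tconv", 1), ("upsample", 2)]

print(trace_spatial_size(192, encoder + question_decoder))
print(trace_spatial_size(192, encoder + stride1_decoder))
```

Under these assumptions, the decoder from the question enlarges the feature map by a factor of 16 while the encoder only shrinks it by 4, so it overshoots the input size. With stride 1 in the transposed convolutions, the two upsampling steps exactly undo the two pooling steps.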

With a couple of changes, the model looks like this:

    import tensorflow as tf
    from tensorflow.keras import layers

    ## Encoder
    input = layers.Input(shape=(3, 192, 192))  # channels-first: (channels, height, width)
    layer = layers.Conv2D(16, (3, 3), activation="relu", padding="same", data_format='channels_first')(input)
    layer = layers.MaxPooling2D((2, 2), padding="same", data_format='channels_first')(layer)
    layer = layers.Conv2D(32, (3, 3), activation="relu", padding="same", data_format='channels_first')(layer)
    layer = layers.MaxPooling2D((2, 2), padding="same", data_format='channels_first')(layer)

    ## Decoder
    # strides=1 here: the UpSampling2D layers alone undo the two pooling steps
    layer = layers.Conv2DTranspose(16, (3, 3), strides=1, activation="relu", padding="same", data_format='channels_first')(layer)
    layer = layers.UpSampling2D((2, 2), data_format='channels_first')(layer)
    layer = layers.Conv2DTranspose(32, (3, 3), strides=1, activation="relu", padding="same", data_format='channels_first')(layer)
    layer = layers.UpSampling2D((2, 2), data_format='channels_first')(layer)
    # 3 output channels so the output shape matches the input shape
    output = layers.Conv2DTranspose(3, (3, 3), strides=1, activation="relu", padding="same", data_format='channels_first')(layer)

    model = tf.keras.Model(inputs=input, outputs=output)
    model.summary()

Model: "model_8"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_10 (InputLayer)        [(None, 3, 192, 192)]     0         
_________________________________________________________________
conv2d_19 (Conv2D)           (None, 16, 192, 192)      448       
_________________________________________________________________
max_pooling2d_18 (MaxPooling (None, 16, 96, 96)        0         
_________________________________________________________________
conv2d_20 (Conv2D)           (None, 32, 96, 96)        4640      
_________________________________________________________________
max_pooling2d_19 (MaxPooling (None, 32, 48, 48)        0         
_________________________________________________________________
conv2d_transpose_19 (Conv2DT (None, 16, 48, 48)        4624      
_________________________________________________________________
up_sampling2d_17 (UpSampling (None, 16, 96, 96)        0         
_________________________________________________________________
conv2d_transpose_20 (Conv2DT (None, 32, 96, 96)        4640      
_________________________________________________________________
up_sampling2d_18 (UpSampling (None, 32, 192, 192)      0         
_________________________________________________________________
conv2d_transpose_21 (Conv2DT (None, 3, 192, 192)       867       
=================================================================
Total params: 15,219
Trainable params: 15,219
Non-trainable params: 0
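
The parameter counts in the summary can be checked by hand. As a small sketch, assuming 3x3 kernels and one bias per output channel (the same count applies to Conv2D and Conv2DTranspose), each layer has (kernel_height * kernel_width * in_channels + 1) * out_channels parameters. The helper name `conv_params` is hypothetical.

```python
def conv_params(kernel_h, kernel_w, in_channels, out_channels):
    """Weights plus one bias per output channel for a 2D (transposed) convolution."""
    return (kernel_h * kernel_w * in_channels + 1) * out_channels

# (in_channels, out_channels) for the five trainable layers, in summary order.
layer_channels = [(3, 16), (16, 32), (32, 16), (16, 32), (32, 3)]
counts = [conv_params(3, 3, c_in, c_out) for c_in, c_out in layer_channels]

print(counts)       # [448, 4640, 4624, 4640, 867]
print(sum(counts))  # 15219
```

The pooling and upsampling layers have no weights, which is why they show 0 parameters in the summary.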

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 LeonardoVaz