'Why is the tensorflow maxout not calculating the gradient respectively where is the mistake?
I am trying to use the tensorflow maxout implementation (https://www.tensorflow.org/addons/api_docs/python/tfa/layers/Maxout) but struggle with it;
I try to illustrate my problem: If I have the following
d=3
x_in=Input(shape=d)
x_out=Dense(d, activation='relu')(x_in)
model = Model(inputs=x_in, outputs=x_out)
model.compile(optimizer='adam', loss='MeanAbsoluteError')
X=tf.random.normal([200,3])
Y=tf.random.normal([200,3])
model.fit(X, Y, epochs=5, batch_size=32)
Then it is working normally, i.e. the loss is continuously getting smaller and I can get the estimated weights:
model.layers[1].get_weights()
Out[141]:
[array([[-0.15133516, -0.14892222, -0.64674205],
[ 0.34437487, 0.7822309 , -0.08931279],
[-0.8330534 , -0.13827904, -0.23096593]], dtype=float32),
array([-0.03069788, -0.03311999, -0.02603031], dtype=float32)]
However, when I want to use a maxout activation instead, things do not work out
d=3
x_in=Input(shape=d)
x_out = tfa.layers.Maxout(3)(x_in)
model = Model(inputs=x_in, outputs=x_out)
model.compile(optimizer='adam', loss='MeanAbsoluteError')
X=tf.random.normal([200,3])
Y=tf.random.normal([200,3])
model.fit(X, Y, epochs=5, batch_size=32)
The loss stays constant for all Epochs and
model.layers[1].get_weights()
Out[141]: []
Where is my mistake?
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|
