Sigmoid function returns NaN no matter the input
I am trying to use the tf.math.sigmoid() function on a tensor in my code.
I am getting a tensor from the NN (let's call it tensor a) along the lines of:
[[[[-0.8762997 ]
[-0.8903917 ]
[-0.8672142 ]
[-0.8688538 ]
[-0.9621545 ]
[-0.89997154]
[-0.8640675 ]
[-0.93268245]
[-0.8761404 ]
[-0.85574013]
[-0.8742257 ]
[-0.89015967]
[-0.8417985 ]
[-0.8387405 ]
[-0.8407151 ]
[-0.8985772 ]
[-0.8235554 ]
[-0.8261194 ]
[-0.8440901 ]
[-0.9285601 ]
[-0.8315757 ]
[-0.8859896 ]
[-0.94726914]
[-0.8773718 ]
[-0.80427724]
[-0.86090446]
[-0.8038499 ]
[-0.92629945]
[-0.7933995 ]
[-0.7654424 ]]]]
I then apply tf.reduce_mean(a, axis=2), which yields:
[[[-0.8665478]]]
of type <class 'tensorflow.python.framework.ops.Tensor'> (let's call this tensor b).
Up to now everything works as expected. However, when I plug tensor b into the sigmoid function (c = tf.math.sigmoid(b)), tensor c always comes out as [[[nan]]]!
This happens no matter the values in tensor b. Various values have come out in tensor b (small negative, larger negative, small positive and larger positive).
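For what it's worth, plugging the reduced mean into the mathematical definition of the sigmoid gives a perfectly ordinary finite value, so NaN is not expected here. A quick sanity check in plain Python (using math.exp instead of TensorFlow, just to verify the arithmetic):

```python
import math

def sigmoid(x):
    """Plain-Python sigmoid: 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))

b = -0.8665478      # the value of tensor b shown above
c = sigmoid(b)      # ~0.296, a perfectly ordinary float, not nan
print(c)
```

So whatever is producing the NaN, it is not the sigmoid of this value itself.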
I've tried various things: squeezing tensor b, casting it to tf.float32, etc., but the result stays the same!
I have also tried casting a constant and plugging it into a sigmoid, which works fine...
Why is this happening? (Thanks so much already for any help!)
**EDIT** More of the code:
import tensorflow as tf

class Model:
    def __init__(self, data):
        # < some code > that computes self.a as a tensor of shape 1x1x30x1
        self.b = tf.reduce_mean(self.a, axis=2)
        self.c = tf.math.sigmoid(self.b)
        x = tf.constant([-128.0], dtype=tf.float32)
        self.testing = tf.math.sigmoid(x)

if __name__ == '__main__':
    # <load some data>
    model = Model(loaded_data)
    # <compute some cross entropy>
    train_step = tf.compat.v1.train.AdamOptimizer(learning_rate).minimize(cross_entropy, global_step=global_step)
    with tf.compat.v1.Session() as sess:
        sess.run(tf.compat.v1.global_variables_initializer())
        sess.run(tf.compat.v1.local_variables_initializer())
        [a_out, b_out, c_out, test_out] = sess.run([model.a, model.b, model.c, model.testing])
        print(a_out)
        print(b_out)
        print(c_out)
        print(test_out)
The resulting outputs are shown above, except for the test variable, which yields the expected result of:
[0.]
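(That [0.] is indeed the expected result: sigmoid(-128) = 1 / (1 + e^128) is on the order of 1e-56, far below the smallest positive float32 value of about 1.4e-45, so it underflows to exactly 0 in float32. A quick check of the arithmetic in plain Python, where the quoted float32 limit is the usual smallest subnormal:)

```python
import math

# sigmoid(-128) = 1 / (1 + e^128); e^128 is huge but still fits in a float64
val = 1.0 / (1.0 + math.exp(128.0))   # ~2.6e-56 in float64

# smallest positive (subnormal) float32 is about 1.4e-45,
# so this value underflows to exactly 0.0 when stored as float32
FLOAT32_MIN_SUBNORMAL = 1.4e-45
print(val < FLOAT32_MIN_SUBNORMAL)    # True: float32 cannot represent it
```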
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow