Activation function of tf.math.pow(x, 0.5) leading to NaN losses
I'm trying to use a custom square-root activation function in my Keras sequential model (specifically for the MNIST dataset). When I use tf.math.sqrt(x), training goes smoothly and the model is quite accurate. However, when I use tf.math.pow(x, 0.5) instead, the model fails to train and the loss goes to NaN.
I don't understand why this happens, since I would expect the two implementations to be mathematically identical.
Square root function

```python
import tensorflow as tf

def tfsqrt(x):
    # Signed square root: sqrt(x) for x >= 0, -sqrt(-x) for x < 0.
    cond = tf.greater_equal(x, 0)
    return tf.where(cond, tf.math.sqrt(x), -tf.math.sqrt(-x))
```
Power function

```python
def pwsqrt(x):
    # Same mapping, but implemented with tf.math.pow(x, 0.5) instead of tf.math.sqrt.
    cond = tf.greater_equal(x, 0)
    return tf.where(cond, tf.math.pow(x, 0.5), -tf.math.pow(-x, 0.5))
```
If anybody could explain this unexpected behavior, that would be much appreciated. Thanks!
Solution 1 [1]
The functions themselves are correct. Testing both on a small tensor:

```python
x = tf.Variable([-2.0, -3.0, 0.0, 1.0, 2.0])
print(tfsqrt(x))
print(pwsqrt(x))
# Both print the same values:
# tf.Tensor([-1.4142135 -1.7320508  0.         1.         1.4142135], shape=(5,), dtype=float32)
```
Both functions work fine in Google Colab, so the forward computation is not the cause. More likely there are NaN values in the input data, or there is a problem in the model's loss or metric.
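A quick way to check the data hypothesis (x_train here is a hypothetical name for your training inputs as a NumPy array):

```python
import numpy as np

# True if any input value is NaN:
print(np.isnan(x_train).any())

# Alternatively, make TensorFlow raise an error as soon as any op
# produces NaN or Inf values:
tf.debugging.enable_check_numerics()
```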
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Peter Pirog |
