GradientTape returning None when run in a loop
The following gradient descent loop fails because the gradients returned by `tape.gradient()` are `None` the second time the loop runs.
```python
w = tf.Variable(tf.random.normal((3, 2)), name='w')
b = tf.Variable(tf.zeros(2, dtype=tf.float32), name='b')
x = tf.constant([[1., 2., 3.]])

for i in range(10):
    print("iter {}".format(i))
    with tf.GradientTape() as tape:
        # forward prop
        y = x @ w + b
        loss = tf.reduce_mean(y**2)
    print("loss is \n{}".format(loss))
    print("output- y is \n{}".format(y))
    # vars getting dropped after a couple of iterations
    print(tape.watched_variables())
    # get the gradients to minimize the loss
    dl_dw, dl_db = tape.gradient(loss, [w, b])
    # descend the gradients
    w = w.assign_sub(0.001*dl_dw)
    b = b.assign_sub(0.001*dl_db)
```
```
iter 0
loss is 
23.328645706176758
output- y is 
[[ 6.8125362 -0.49663293]]
(<tf.Variable 'w:0' shape=(3, 2) dtype=float32, numpy=
array([[-1.3461215 ,  0.43708783],
       [ 1.5931423 ,  0.31951016],
       [ 1.6574576 , -0.52424705]], dtype=float32)>, <tf.Variable 'b:0' shape=(2,) dtype=float32, numpy=array([0., 0.], dtype=float32)>)
iter 1
loss is 
22.634033203125
output- y is 
[[ 6.7103477 -0.48918355]]
()

TypeError                                 Traceback (most recent call last)
c:\projects\pyspace\mltest\test.ipynb Cell 7' in <cell line: 1>()
     11 dl_dw, dl_db = tape.gradient(loss,[w,b])
     13 #descend the gradients
---> 14 w = w.assign_sub(0.001*dl_dw)
     15 b = b.assign_sub(0.001*dl_db)

TypeError: unsupported operand type(s) for *: 'float' and 'NoneType'
```
I checked the documentation, which explains the cases in which gradients can become `None`, but none of them apply here.
Solution 1:[1]
This is because `assign_sub` returns a Tensor. In the line `w = w.assign_sub(0.001*dl_dw)` you are therefore overwriting `w` with a Tensor holding the new value. In the next iteration it is no longer a `tf.Variable`, so the gradient tape does not track it by default, and the gradient comes back as `None` (Tensors also do not have the `assign_sub` method, so that line would crash as well).
Instead, simply write `w.assign_sub(0.001*dl_dw)`, and do the same for `b`. The assign methods update the variable in place, so no reassignment is necessary.
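A minimal corrected version of the loop, as a sketch (assuming TensorFlow 2.x eager execution; the shapes, learning rate, and loss are taken from the question, with a seed added for reproducibility):

```python
import tensorflow as tf

tf.random.set_seed(0)  # reproducibility; not in the original question

w = tf.Variable(tf.random.normal((3, 2)), name='w')
b = tf.Variable(tf.zeros(2, dtype=tf.float32), name='b')
x = tf.constant([[1., 2., 3.]])

losses = []
for i in range(10):
    with tf.GradientTape() as tape:
        y = x @ w + b
        loss = tf.reduce_mean(y**2)
    dl_dw, dl_db = tape.gradient(loss, [w, b])
    # update in place; do NOT rebind w/b to the return value
    w.assign_sub(0.001 * dl_dw)
    b.assign_sub(0.001 * dl_db)
    losses.append(float(loss))

# w and b remain tf.Variable objects, so the tape keeps watching them
print(isinstance(w, tf.Variable), losses[0] > losses[-1])
```

In practice this kind of update is usually delegated to an optimizer, e.g. `tf.keras.optimizers.SGD(0.001).apply_gradients(zip([dl_dw, dl_db], [w, b]))`, which performs the same in-place updates for you.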
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | xdurch0 |
