TensorFlow 2 tf.function decorator
I have TensorFlow 2.0 and Python 3.7.5.
I have written the following code to perform mini-batch gradient descent:
@tf.function
def train_one_step(model, mask_model, optimizer, x, y):
    '''
    Function to compute one step of gradient descent optimization.
    '''
    with tf.GradientTape() as tape:
        # Make predictions using the defined model-
        y_pred = model(x)
        # Compute loss-
        loss = loss_fn(y, y_pred)

    # Compute gradients w.r.t. the loss and the weights and biases-
    grads = tape.gradient(loss, model.trainable_variables)
    # type(grads) is list

    # List to hold the element-wise multiplication between
    # computed gradients and masks-
    grad_mask_mul = []

    # Perform element-wise multiplication between computed gradients and masks-
    for grad_layer, mask in zip(grads, mask_model.trainable_weights):
        grad_mask_mul.append(tf.math.multiply(grad_layer, mask))

    # Apply the masked gradients to the model's weights and biases-
    optimizer.apply_gradients(zip(grad_mask_mul, model.trainable_variables))

    # Update loss and accuracy metrics-
    train_loss(loss)
    train_accuracy(y, y_pred)

    return None
In the code, "mask_model" is a model whose weights are all either 0 or 1. Its purpose is to control which parameters are trained (since 0 * gradient = 0, a masked parameter receives no update).
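As an illustrative sketch (the tensors and values here are hypothetical, not from the original post), the masking idea can be shown with plain tensors: multiplying a gradient by a 0/1 mask zeroes out the update for every masked entry.

```python
import tensorflow as tf

# Hypothetical gradient values and a 0/1 mask (illustrative only)
grads = tf.constant([0.5, -1.2, 0.3])
mask = tf.constant([1.0, 0.0, 1.0])  # 0 freezes the corresponding parameter

# Element-wise multiplication: entries where mask == 0 contribute no update
masked_grads = tf.math.multiply(grads, mask)
print(masked_grads)
```

Applying these masked gradients with `optimizer.apply_gradients` then leaves the masked parameters unchanged.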
My question is: I am using the "grad_mask_mul" list variable inside the tf.function-decorated "train_one_step()" function. Can this cause any problems, such as:
ValueError: tf.function-decorated function tried to create variables on non-first call.
Or do you see any other problem with using a list variable inside a tf.function-decorated function?
Thanks!
Solution 1:[1]
This is a bug in TensorFlow 2. You can read more about it here: TF2 bug
Solution 2:[2]
In case people are still getting the error
ValueError: tf.function-decorated function tried to create variables on non-first call.
but are unsure what is going on: the TensorFlow team updated the "function" guide around Feb 2021 (see https://github.com/tensorflow/tensorflow/issues/36574).
See the updated guide, particularly the "Creating tf.Variables" section: https://www.tensorflow.org/guide/function#creating_tfvariables
Basically, what the OP needs to ensure is:
- In the very first call to the tf.function-decorated train_one_step, all tf.Variables are created once, and no new tf.Variables are ever created in any of the models or optimizers in a subsequent (i.e. non-first) call to train_one_step.
Likely, you have passed a new, unbuilt version of model or mask_model into train_one_step, and TensorFlow is trying to build it (i.e. create new tf.Variables), but train_one_step was already called before as a tf.function.
The current (updated) guide explains how to address such issues.
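As an illustrative sketch of the pattern that guide recommends (the class and variable names here are hypothetical, not from the original code): defer variable creation so it happens exactly once, on the first trace, instead of letting an unbuilt model create fresh tf.Variables inside an already-traced tf.function.

```python
import tensorflow as tf

class ScaleLayer(tf.Module):
    """Hypothetical module that creates its variable only on the first call."""
    def __init__(self):
        self.scale = None

    @tf.function
    def __call__(self, x):
        # Create the tf.Variable only during the very first trace;
        # later calls reuse it, so no "non-first call" error is raised.
        if self.scale is None:
            self.scale = tf.Variable(2.0)
        return self.scale * x

layer = ScaleLayer()
print(layer(tf.constant(3.0)))  # variable is created here, once
print(layer(tf.constant(4.0)))  # same variable is reused; no error
```

Equivalently, you can build the model eagerly (call it once on sample data) before the first tf.function call, so all variables already exist when tracing starts.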
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Arun |
| Solution 2 | |
