TensorFlow 2 tf.function decorator

I have TensorFlow 2.0 and Python 3.7.5.

I have written the following code to perform one step of mini-batch gradient descent:

import tensorflow as tf

@tf.function
def train_one_step(model, mask_model, optimizer, x, y):
    '''
    Function to compute one step of gradient descent optimization
    '''
    with tf.GradientTape() as tape:
        # Make predictions with the model
        y_pred = model(x)

        # Compute the loss
        loss = loss_fn(y, y_pred)

    # Compute gradients of the loss w.r.t. the model's weights and biases
    grads = tape.gradient(loss, model.trainable_variables)

    # grads is a plain Python list of gradient tensors

    # List to hold the element-wise product of each gradient and its mask
    grad_mask_mul = []

    # Multiply each gradient element-wise by the corresponding mask
    for grad_layer, mask in zip(grads, mask_model.trainable_weights):
        grad_mask_mul.append(tf.math.multiply(grad_layer, mask))

    # Apply the masked gradients to the model's weights and biases
    optimizer.apply_gradients(zip(grad_mask_mul, model.trainable_variables))

    # Update the loss and accuracy metrics
    train_loss(loss)
    train_accuracy(y, y_pred)

    return None

In the code, "mask_model" is a model whose weights are all either 0 or 1. It is used to control which parameters of "model" are trained, since multiplying a gradient by 0 zeroes out the corresponding update.
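
For illustration, one way such a mask model could be built is to clone the original model's architecture and fill its weights with 0/1 values. This is only a sketch under assumptions not stated in the question: make_mask_model and keep_fraction are hypothetical names, and the model is assumed to be a simple, already-built Keras model whose weights are all trainable (so get_weights() lines up with trainable_variables):

import numpy as np
import tensorflow as tf

def make_mask_model(model, keep_fraction=0.5):
    # Clone the architecture so the mask weights have the same shapes and
    # ordering as the original model's weights (assumes `model` is already built).
    mask_model = tf.keras.models.clone_model(model)

    # Fill every weight tensor with random 0/1 entries: 1 keeps the
    # corresponding parameter trainable, 0 freezes it.
    binary_weights = [
        np.random.binomial(1, keep_fraction, size=w.shape).astype(np.float32)
        for w in mask_model.get_weights()
    ]
    mask_model.set_weights(binary_weights)
    return mask_model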

My question is: I am using the Python list variable "grad_mask_mul" inside the tf.function-decorated "train_one_step()". Can this cause any problems, such as:

ValueError: tf.function-decorated function tried to create variables on non-first call.

Or do you see any other problem with using a list variable inside a tf.function-decorated function?
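
For context, loss_fn, train_loss and train_accuracy are not defined inside the function; a minimal sketch of how the surrounding setup and training loop might look (the specific loss, metrics, optimizer and dataset below are assumptions, not taken from the question):

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
train_loss = tf.keras.metrics.Mean(name='train_loss')
train_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name='train_accuracy')
optimizer = tf.keras.optimizers.Adam()

for epoch in range(num_epochs):                 # num_epochs: assumed
    for x_batch, y_batch in train_dataset:      # train_dataset: assumed tf.data.Dataset
        train_one_step(model, mask_model, optimizer, x_batch, y_batch)
    print(f'epoch {epoch}: loss={train_loss.result():.4f}, '
          f'accuracy={train_accuracy.result():.4f}')
    train_loss.reset_states()
    train_accuracy.reset_states()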

Thanks!



Solution 1:[1]

This is a bug in TensorFlow 2. You can read more about it here: TF2 bug

Solution 2:[2]

In case people are still getting the error

ValueError: tf.function-decorated function tried to create variables on non-first call.

but are unsure what is going on: the TensorFlow team updated the "function" guide around February 2021 (see https://github.com/tensorflow/tensorflow/issues/36574).

See the updated guide, in particular the "Creating tf.Variables" section: https://www.tensorflow.org/guide/function#creating_tfvariables

Basically, what the OP needs to ensure is:

  • In the very first call to the tf.function-decorated train_one_step, all tf.Variables are created once, and no new tf.Variables are ever created, in any of the models or optimizers, on a subsequent (i.e. non-first) call to train_one_step (see the sketch right after this list).
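
The pattern that guide section describes is, roughly: either create all tf.Variables outside the tf.function, or guard their creation so it only happens on the very first call. A minimal sketch of the guarded pattern (the Counter class below is an illustration, not from the original answer):

class Counter(tf.Module):
    def __init__(self):
        self.count = None

    @tf.function
    def __call__(self):
        if self.count is None:
            # The tf.Variable is created only during the first call/trace;
            # later calls reuse it, so no ValueError is raised.
            self.count = tf.Variable(0)
        return self.count.assign_add(1)

counter = Counter()
print(counter())  # tf.Tensor(1, ...) -- the variable is created here
print(counter())  # tf.Tensor(2, ...) -- no new variable, just an update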

Likely, you have passed into train_one_step a new, unbuilt version of model or mask_model, and TensorFlow is trying to build it (i.e. create new tf.Variables), even though train_one_step has already been called (and traced) as a tf.function.

The current (updated) guide explains how to address such issues.
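
Applied to the question's code, that usually means creating and building model, mask_model and the optimizer once, before the first call to train_one_step, and then reusing the same objects on every call. A sketch of that idea (build_model, make_mask_model and the dummy input shape are assumptions):

# Create everything exactly once, outside the tf.function.
model = build_model()                # assumed helper returning the Keras model
mask_model = make_mask_model(model)  # hypothetical helper from the sketch above
optimizer = tf.keras.optimizers.Adam()

# Run one dummy batch through both models so all their tf.Variables exist
# before train_one_step is ever traced.
dummy_x = tf.zeros([1, 784])         # input shape is an assumption
_ = model(dummy_x)
_ = mask_model(dummy_x)

for x_batch, y_batch in train_dataset:
    train_one_step(model, mask_model, optimizer, x_batch, y_batch)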

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1: Arun
Solution 2: (no source listed)