run_eagerly=True makes the training results different in TensorFlow 2.3.2

Recently I came across a strange issue while running neural network code on TensorFlow 2.3.2. When I only change run_eagerly=True to run_eagerly=False in the compile config

    model.compile(
        loss={"label": "binary_crossentropy"},
        metrics=tf.keras.metrics.AUC(name="auc"),
        run_eagerly=True
    )

the model gets quite different results, mainly in the loss and AUC. With run_eagerly=True, the loss and AUC are:

    INFO:root:batch (99), speed (67.83 qps/s), loss (7.76), auc (0.50)
    INFO:root:batch (199), speed (77.42 qps/s), loss (7.70), auc (0.50)
    INFO:root:batch (299), speed (77.81 qps/s), loss (7.69), auc (0.50)
    INFO:root:batch (399), speed (75.01 qps/s), loss (7.64), auc (0.50)
    INFO:root:batch (499), speed (70.51 qps/s), loss (7.68), auc (0.50)
    INFO:root:batch (599), speed (77.87 qps/s), loss (7.70), auc (0.50)
    INFO:root:batch (699), speed (75.42 qps/s), loss (7.70), auc (0.50)
while with run_eagerly=False the result is:

    INFO:root:batch (199), speed (107.17 qps/s), loss (1.12), auc (0.51)
    INFO:root:batch (299), speed (100.84 qps/s), loss (1.00), auc (0.52)
    INFO:root:batch (399), speed (98.40 qps/s), loss (0.93), auc (0.53)
    INFO:root:batch (499), speed (101.34 qps/s), loss (0.89), auc (0.55)
    INFO:root:batch (599), speed (102.09 qps/s), loss (0.86), auc (0.56)
    INFO:root:batch (699), speed (94.13 qps/s), loss (0.83), auc (0.57)
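
As a sanity check that each setting really runs in the mode I expect, tf.executing_eagerly() can be checked from inside a layer's call(). Below is a minimal standalone sketch (not part of my model) showing what each compile setting prints:

    import numpy as np
    import tensorflow as tf

    class ProbeLayer(tf.keras.layers.Layer):
        """Prints which execution mode the forward pass runs in."""
        def call(self, x):
            # True under compile(run_eagerly=True); False when the train step
            # has been traced into a tf.function (run_eagerly=False).
            print("executing eagerly:", tf.executing_eagerly())
            return x

    def make_model(run_eagerly):
        model = tf.keras.Sequential([ProbeLayer(),
                                     tf.keras.layers.Dense(1, activation="sigmoid")])
        model.compile(optimizer="adam", loss="binary_crossentropy",
                      run_eagerly=run_eagerly)
        return model

    x = np.random.rand(32, 4).astype("float32")
    y = np.random.randint(0, 2, size=(32, 1)).astype("float32")
    make_model(True).fit(x, y, epochs=1, verbose=0)   # prints True on every batch
    make_model(False).fit(x, y, epochs=1, verbose=0)  # prints False, only while tracing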

My model is defined as follows:

    import logging
    import tensorflow as tf
    from models.core.fully_connected_layers import MultiHeadLayer, get_fc_layers
    from .base_net import BaseNet


    class SingleTowerNet(BaseNet):

        def __init__(self, features, net_conf, **kwargs):
            super(SingleTowerNet, self).__init__(**kwargs)
            self.features = features
            self.net_conf = net_conf
            self.user_hidden_num_list = [int(i) for i in self.net_conf["User"].split(",")]
            self.train_features = features.train_feature_names
            self.user_feature_names = features.user_feature_names
            self.label_name = features.label_feature_names[0]
            assert len(features.label_feature_names) == 1, "must have only one label name"
            self.preprocess_layers = self.get_deep_process_layer_map(self.train_features, features)
            self.user_fc_layers = get_fc_layers(self.user_hidden_num_list)

        def get_user_embedding(self, user_features):
            # Embed each user feature, concatenate, run through the FC tower,
            # then L2-normalize the resulting embedding.
            user_emb_list = [self.preprocess_layers[name](user_features[name])
                             for name in self.user_feature_names]
            user_concat = tf.concat(user_emb_list, axis=1)
            user_embedding = self.user_fc_layers(user_concat)
            user_embedding_norm = tf.math.l2_normalize(user_embedding, axis=1)
            return user_embedding_norm

        def call(self, train_data, training=False):
            user_feature = {name: train_data[name] for name in self.user_feature_names}
            user_embedding_norm = self.get_user_embedding(user_feature)

            # Scaled sum of the normalized embedding, squashed into a probability.
            d = tf.math.reduce_sum(user_embedding_norm * 10, axis=1)
            p = tf.math.sigmoid(d)
            if training:
                # Training: return only the prediction, keyed by the label name.
                return {self.label_name: p}
            else:
                # Evaluation/prediction: also return the label and ad_group_id.
                label = tf.squeeze(train_data[self.label_name])
                ad_group_id = tf.squeeze(train_data["ad_group_id"])
                return {"predict": p,
                        "label": label,
                        "ad_group_id": ad_group_id}
Does anyone know what is happening here?
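
One way I can think of to narrow this down is to compare a single forward pass on the same batch and the same weights, once eagerly and once traced into a tf.function. This is only a sketch; model and batch are assumed to already hold an instance of SingleTowerNet and one batch of the training data:

    import tensorflow as tf

    # model and batch are assumed to already exist.

    @tf.function
    def traced_call(data):
        return model(data, training=True)

    eager_out = model(batch, training=True)   # plain eager call
    graph_out = traced_call(batch)            # same call, traced into a graph
    for key in eager_out:
        diff = tf.reduce_max(tf.abs(eager_out[key] - graph_out[key]))
        tf.print(key, "max abs difference:", diff)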


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
