TensorFlow: Using None-Values in a Tensor

I am trying to use this DSNT-layer from GitHub: https://github.com/ashwhall/dsnt/

The implementation seems to have a problem with a placeholder built from the input size and the batch size. My understanding is that the batch size is usually unknown while the graph is constructed, unless a fixed batch dimension is defined in the input layer, and only becomes concrete once training begins.

Based on the batch size, the DSNT layer creates the following tensors:

    batch_count = tf.shape(norm_heatmap)[0]
    height = tf.shape(norm_heatmap)[1]
    width = tf.shape(norm_heatmap)[2]

    # TODO scalars for the new coord system
    scalar_x = ((2 * width) - (width + 1)) / width
    scalar_x = tf.cast(scalar_x, tf.float32)
    scalar_y = ((2 * height) - (height + 1)) / height
    scalar_y = tf.cast(scalar_y, tf.float32)

    # Build the DSNT x, y matrices
    dsnt_x = tf.tile([[(2 * tf.range(1, width+1) - (width + 1)) / width]], [batch_count, height, 1]) # <-- point of error
    dsnt_x = tf.cast(dsnt_x, tf.float32)
    dsnt_y = tf.tile([[(2 * tf.range(1, height+1) - (height + 1)) / height]], [batch_count, width, 1])
    dsnt_y = tf.cast(tf.transpose(dsnt_y, perm=[0, 2, 1]), tf.float32)

When I run this code, I get the following error message:

raise e.with_traceback(filtered_tb) from None
ValueError: Shape [1,2,3,4,5,...,64] is too large (more than 2**63 - 1 entries) for '{{node Placeholder}} = Placeholder[dtype=DT_INT32, shape=[1,2,3,4,5,..., 64]]()' with input shapes: .
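
As far as I can tell, the shape in that message is the `tf.range` values being misread as a placeholder shape: wrapping a symbolic tensor in a Python list literal (`[[...]]`) makes TensorFlow auto-pack it into a constant. A sketch of the same grid construction that avoids the list nesting by using `tf.reshape` (my own rewrite for illustration, not the library's code):

```python
import tensorflow as tf

def build_dsnt_grids(norm_heatmap):
    # All three dimensions are read dynamically, so they may be None in the
    # static shape and only become concrete at call time.
    batch_count = tf.shape(norm_heatmap)[0]
    height = tf.shape(norm_heatmap)[1]
    width = tf.shape(norm_heatmap)[2]

    # Pixel-centre coordinates in the normalized [-1, 1] coordinate system.
    xs = tf.cast(2 * tf.range(1, width + 1) - (width + 1), tf.float32) \
        / tf.cast(width, tf.float32)
    ys = tf.cast(2 * tf.range(1, height + 1) - (height + 1), tf.float32) \
        / tf.cast(height, tf.float32)

    # Reshape to rank 3 with tf.reshape instead of [[...]], so the symbolic
    # tensor is never auto-packed by a Python list literal.
    dsnt_x = tf.tile(tf.reshape(xs, [1, 1, -1]), [batch_count, height, 1])
    dsnt_y = tf.tile(tf.reshape(ys, [1, -1, 1]), [batch_count, 1, width])
    return dsnt_x, dsnt_y
```

Reshaping `ys` to `[1, -1, 1]` and tiling along the last axis also removes the need for the `tf.transpose` in the original.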

I found answers on Stack Overflow recommending `tf.shape` to avoid problems with unknown dimensions, but that does not seem to be enough here.
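
For reference, this is the distinction those answers draw: `x.shape` is the static shape (with `None` for unknown dimensions), while `tf.shape(x)` is an `int32` tensor whose values only exist once a concrete input is passed. A minimal illustration:

```python
import tensorflow as tf

@tf.function(input_signature=[tf.TensorSpec([None, None, 1], tf.float32)])
def dynamic_shape(x):
    # Inside the traced function, x.shape is (None, None, 1): the static
    # shape, unknown at graph-construction time. tf.shape(x) is an int32
    # tensor that is resolved when the function is called.
    return tf.shape(x)

shape = dynamic_shape(tf.zeros([2, 5, 1]))  # concrete values available here
```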

If an input with shape `(None, None, 1)` is used, the code executes. It also runs on TensorFlow 2.5.3 or lower.

My question: how do I use unknown values that are only defined once training has started?

I attached a minimal example, using Python 3.10 and TensorFlow 2.8.

The input is an image of a certain size, e.g. 128x64x1, and the output is the normalized coordinate of the center of mass.
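
To make the setup concrete: given a heatmap normalized to sum to 1, the coordinate output is its center of mass (expected value) over a normalized grid. A rough stand-in for what the layer computes (a hypothetical helper of my own, not the library's implementation):

```python
import tensorflow as tf

def center_of_mass(norm_heatmap):
    """norm_heatmap: [batch, height, width], each image summing to 1."""
    h = tf.cast(tf.shape(norm_heatmap)[1], tf.float32)
    w = tf.cast(tf.shape(norm_heatmap)[2], tf.float32)
    # Pixel-centre coordinates in the normalized [-1, 1] system.
    xs = (2.0 * tf.range(1.0, w + 1.0) - (w + 1.0)) / w   # [width]
    ys = (2.0 * tf.range(1.0, h + 1.0) - (h + 1.0)) / h   # [height]
    # Expected coordinate = sum over pixels of probability mass * coordinate.
    ex = tf.reduce_sum(norm_heatmap * xs[tf.newaxis, tf.newaxis, :], axis=[1, 2])
    ey = tf.reduce_sum(norm_heatmap * ys[tf.newaxis, :, tf.newaxis], axis=[1, 2])
    return tf.stack([ex, ey], axis=-1)                    # [batch, 2]
```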

    import tensorflow as tf
    from tensorflow.keras.layers import Input, Conv2D
    from tensorflow.keras.models import Model
    from tensorflow.keras.initializers import he_uniform
    import dsnt

    def Minimal_Model(_):
        input_shape = (128, 64, 1)
        X_input = Input(shape=input_shape)
        X_out = Conv2D(filters=1, kernel_size=(1, 1), strides=(1, 1), padding='valid',
                       name="conv", kernel_initializer=he_uniform())(X_input)

        norm_heatmap, coordinates = dsnt.dsnt(X_out)
        model = Model(inputs=X_input, outputs=coordinates, name='Test-DSNT')

        model.compile(optimizer=tf.keras.optimizers.Adam(0.0001),
                      loss=tf.keras.losses.MeanSquaredError(),
                      metrics=[tf.keras.metrics.MeanSquaredError()])
        return model

    import time

    import tensorflow as tf
    from keras_tuner.tuners import BayesianOptimization

    from models import Minimal_Model

    tf.get_logger().setLevel('DEBUG')

    MAX_TRIALS = 10
    EXECUTION_PER_TRIAL = 1
    BATCH_SIZE = 8
    EPOCHS = 10
    LOG_DIR = 'results-random' + f"{int(time.time())}"

    train_images = tf.random.uniform((2000, 128, 64, 1), minval=0, maxval=1, dtype=tf.float32)
    test_images = tf.random.uniform((200, 128, 64, 1), minval=0, maxval=1, dtype=tf.float32)
    train_labels = tf.random.uniform((2000, 2, 1), minval=0, maxval=1, dtype=tf.float32)
    test_labels = tf.random.uniform((200, 2, 1), minval=0, maxval=1, dtype=tf.float32)

    tuner = BayesianOptimization(
        Minimal_Model,
        seed=1,
        objective='val_mean_squared_error',
        max_trials=MAX_TRIALS,
        executions_per_trial=EXECUTION_PER_TRIAL,
        directory=LOG_DIR,
        project_name="project"
    )

    tuner.search(train_images, train_labels, epochs=EPOCHS, batch_size=BATCH_SIZE,
                 validation_data=(test_images, test_labels),
                 callbacks=[tf.keras.callbacks.EarlyStopping(monitor='val_mean_squared_error',
                                                             restore_best_weights=True,
                                                             patience=3, mode='min')])

    # Show a summary of the search
    tuner.results_summary(num_trials=1)


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
