Adding a row to a batch of matrices inside a TensorFlow model

I use a transformer model that outputs a 6-element vector describing the transformation to apply to the image I want to process (I reshape it into a 2x3 matrix, which is really the first two rows of the 3x3 matrix that describes the transformation in homogeneous coordinates). The transformation is used to improve the performance of an anomaly-detection model, but it then needs to be inverted so the processed image can be mapped back in place.

I only need the first two rows of each matrix for the transformation, since the image is 2D, but to invert the matrices I need to append a [0, 0, 1] row at the bottom to make them square. Since it's all done with the TF API it has to work on batches, so I tried using tf.map_fn:

def transform_grids_matrix_inv(transformations, grids, inputs):
    with tf.name_scope("transform_grids"):
        trans_matrices = tf.reshape(transformations, (-1, 2, 3))
        row_to_add = [0, 0, 1]
        # Append [0, 0, 1] to each 2x3 matrix so the batch becomes (N, 3, 3)
        appended_trans_matrices = tf.map_fn(
            lambda x: tf.concat((x, [row_to_add]), axis=0), trans_matrices)
        trans_matrices = tf.linalg.inv(appended_trans_matrices)
        gs = tf.squeeze(grids, -1)

        reprojected_grids = tf.matmul(trans_matrices, gs, transpose_b=True)
        # transform grid range from [-1, 1) to the range of [0, 1)
        reprojected_grids = (tf.linalg.matrix_transpose(reprojected_grids) + 1) * 0.5
        _, H, W, _ = inputs.shape
        reprojected_grids = tf.math.multiply(reprojected_grids, [W, H])

        return reprojected_grids

To transform the image I use grids coordinates which will then be used to resample the image.

When I try to define my model the following way:

img = x_training_half[0]
data_shape = img.shape
H, W, _ = img.shape
inputs = Input(shape=data_shape)

localization_head = create_localization_head_2(inputs)
x = spatial_transform_input_matrix(inputs, localization_head.output)
x = run_model_AE(x)
x = spatial_transform_input_matrix_inv(x, localization_head.output)

model = tf.keras.Model(inputs=inputs, outputs=x)

model.compile(optimizer="adam", loss="binary_crossentropy")

I get the following error message:

TypeError: Could not build a TypeSpec for KerasTensor(type_spec=TensorSpec(shape=(None, 2, 3), dtype=tf.float32, name=None), name='tf.reshape_139/Reshape:0',
description="created by layer 'tf.reshape_139'") of unsupported type <class 'keras.engine.keras_tensor.KerasTensor'>.

The traceback points back to the tf.map_fn call.
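For what it's worth, the same append can be written with batch-level ops instead of a per-element map, which avoids passing a KerasTensor to tf.map_fn (a sketch I haven't fully wired into the model; append_homogeneous_row is a name I made up):

```python
import tensorflow as tf

def append_homogeneous_row(trans_matrices):
    """Append a constant [0, 0, 1] row to every (2, 3) matrix in a batch.

    trans_matrices: float tensor of shape (batch, 2, 3).
    Returns a tensor of shape (batch, 3, 3).
    """
    batch_size = tf.shape(trans_matrices)[0]
    # One copy of the bottom row per batch element, built with ordinary
    # batch-aware ops (tf.tile + tf.concat) instead of tf.map_fn.
    bottom_row = tf.tile(
        tf.constant([[[0.0, 0.0, 1.0]]], dtype=trans_matrices.dtype),
        [batch_size, 1, 1],
    )
    return tf.concat([trans_matrices, bottom_row], axis=1)
```

Because everything here is a plain TF op on the whole batch, it should trace cleanly inside a functional-API model where tf.map_fn does not.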

tf.map_fn works on batches of data when I test it outside the model, but not inside the model definition with the functional API (here's an example of it working outside the model):

import numpy as np
import tensorflow as tf

M = np.array([[1, 2, 3], [4, 5, 6]])
batch = [M, M]
t = tf.convert_to_tensor(batch, dtype=np.float32)
trans_matrices = tf.map_fn(lambda x: tf.concat((x, [[0, 0, 1]]), axis=0), t)
print(trans_matrices)

Output:

tf.Tensor(
[[[1. 2. 3.]
  [4. 5. 6.]
  [0. 0. 1.]]

 [[1. 2. 3.]
  [4. 5. 6.]
  [0. 0. 1.]]], shape=(2, 3, 3), dtype=float32)

As a workaround I tried changing my model to output a 4-element vector containing R, theta, translation_x and translation_y, and then using those to fill in the 2x3 matrix with something like this:

R = transformation[0]
Theta = transformation[1]
tx = transformation[2]
ty = transformation[3]
A = R * tf.math.cos(Theta)
B = -R * tf.math.sin(Theta)
C = ty
D = R * tf.math.sin(Theta)
E = R * tf.math.cos(Theta)
F = tx
Transformations = tf.stack((A, B, C, D, E, F), axis=0)

But it's not ideal, as it constrains the transformation to a similarity transform (rotation, uniform scale and translation) instead of a general affine one.
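For completeness, that construction can also be vectorized over the whole batch with tf.stack rather than done per sample (a sketch; params_to_affine is a name I made up, and I keep the C = ty, F = tx layout from the snippet above):

```python
import tensorflow as tf

def params_to_affine(params):
    """Build a batch of 2x3 affine matrices from (R, theta, tx, ty).

    params: float tensor of shape (batch, 4), columns = scale R,
    rotation angle theta, translations tx and ty.
    Returns a tensor of shape (batch, 2, 3).
    """
    R, theta, tx, ty = tf.unstack(params, axis=-1)         # each (batch,)
    cos_t = tf.math.cos(theta)
    sin_t = tf.math.sin(theta)
    row1 = tf.stack([R * cos_t, -R * sin_t, ty], axis=-1)  # (batch, 3)
    row2 = tf.stack([R * sin_t, R * cos_t, tx], axis=-1)   # (batch, 3)
    return tf.stack([row1, row2], axis=1)                  # (batch, 2, 3)
```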

I also thought of having my model output a 9-element vector so I wouldn't have to append the row, but that would probably hurt performance, since the model would essentially have to learn to fill in the bottom row even though it isn't used.
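Another idea I haven't tried inside the model: invert the affine transform analytically, so the bottom row never has to be materialized. For M = [A | t] with A the 2x2 block and t the translation column, the inverse homogeneous transform is [A^-1 | -A^-1 @ t] (a sketch; invert_affine_2x3 is a name I made up):

```python
import tensorflow as tf

def invert_affine_2x3(trans_matrices):
    """Invert a batch of 2x3 affine matrices without padding them to 3x3.

    trans_matrices: float tensor of shape (batch, 2, 3).
    Returns the inverse transforms, also of shape (batch, 2, 3).
    """
    A = trans_matrices[:, :, :2]   # (batch, 2, 2) linear part
    t = trans_matrices[:, :, 2:]   # (batch, 2, 1) translation column
    A_inv = tf.linalg.inv(A)
    t_inv = -tf.matmul(A_inv, t)   # inverse of [A | t] is [A^-1 | -A^-1 t]
    return tf.concat([A_inv, t_inv], axis=-1)
```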



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
