TensorFlow TextVectorization: convert the predicted text back to a human-readable string

I have developed a model that generates the next word following a sequence. The model successfully outputs a prediction, but I can't use it directly because it is still in vectorized (integer) form.

I use tensorflow.keras.layers.TextVectorization to vectorize the text:

text_vectorizer = layers.TextVectorization(
    max_tokens=len_shared_vocabulary,
    output_mode="int",
    output_sequence_length=max_inputs_length,
    vocabulary=shared_vocabulary,
)
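
For reference, the layer maps every token to its integer index in the vocabulary and pads the result to output_sequence_length. A minimal, self-contained sketch (using a made-up shared_vocabulary and max_inputs_length, not my real ones) behaves like this:

from tensorflow.keras import layers

# toy stand-ins for my real shared_vocabulary / max_inputs_length
shared_vocabulary = ["some", "text", "to", "run", "the", "prediction", "on"]
len_shared_vocabulary = len(shared_vocabulary) + 2  # +2 for the padding and OOV slots
max_inputs_length = 8

text_vectorizer = layers.TextVectorization(
    max_tokens=len_shared_vocabulary,
    output_mode="int",
    output_sequence_length=max_inputs_length,
    vocabulary=shared_vocabulary,
)

print(text_vectorizer(["some text to run"]))
# tf.Tensor([[2 3 4 5 0 0 0 0]], shape=(1, 8), dtype=int64)
# index 0 is the padding token "" and index 1 is the OOV token "[UNK]"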

Then I do:

vectorized_text = text_vectorizer(text)

And I feed the data to the model:

input = Input(shape=(max_inputs_length,), name='input')
inputs_arr.append(input)

# ...layers

concatenated = layers.concatenate(layers_arr, axis=-1)

output_tensor = layers.Dense(1, activation='softsign')(concatenated)

model = Model(inputs_arr, output_tensor)
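
For brevity I haven't spelled out the intermediate layers. Purely as an illustration (this is not my actual architecture), here is one self-contained way the snippet above could be completed, e.g. an embedding over the integer ids followed by pooling:

from tensorflow.keras import Input, Model, layers

inputs_arr, layers_arr = [], []

input_layer = Input(shape=(max_inputs_length,), name='input')
inputs_arr.append(input_layer)

# hypothetical stand-in for the omitted layers: embed the integer ids,
# then pool over the sequence axis so the branches can be concatenated
embedded = layers.Embedding(input_dim=len_shared_vocabulary, output_dim=64)(input_layer)
layers_arr.append(layers.GlobalAveragePooling1D()(embedded))
layers_arr.append(layers.GlobalMaxPooling1D()(embedded))

concatenated = layers.concatenate(layers_arr, axis=-1)
output_tensor = layers.Dense(1, activation='softsign')(concatenated)
model = Model(inputs_arr, output_tensor)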

Now to make a prediction I embed the text vectorization layer in a "new" model:

input = keras.Input(shape=(1,), dtype="string")
processed_input = text_vectorizer(input)
outputs = model(processed_input)
inference_model = keras.Model(input, outputs)

And finally:

raw_input = tf.convert_to_tensor([
    ["Some text to run the prediction on"],
])
predictions = inference_model(raw_input)
print(predictions[0])

Now, how do I convert the vectorized output in predictions[0] to a readable string?



Solution 1:[1]

You can do something like this:

vocab = text_vectorizer.get_vocabulary()
" ".join([vocab[each] for each in tf.squeeze(predictions)])

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Ananay Mital