How does one invert an encoded prediction in Keras for model serving?

I have a Keras model in which I have successfully added a StringLookup preprocessing step as part of the model definition. This is generally good practice because I can then feed the model raw data and get back a prediction.

I am feeding the model string words that are mapped to integers. The Y values are also string words that have been mapped to integers.

Here is the implementation of the encoder and decoder:

# Generate the encoder and decoder
encoder = tf.keras.layers.StringLookup(vocabulary=vocab)
decoder = tf.keras.layers.StringLookup(vocabulary=vocab, output_mode="int", invert=True)

Here is some of the code that builds the inference model:

# For inference, you can export a model that accepts strings as input
inputs = Input(shape=(6,), dtype="string")
x = encoder(inputs)
outputs = keras_model(x)
inference_model = Model(inputs, outputs)

inference_model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])  
inference_model.summary()

The encoder above is just an instance of tf.keras.layers.StringLookup.

Now, inside the notebook, I can easily convert the predictions back to their original string representations by using a decoder, which implements the reverse of StringLookup.
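For reference, the notebook-side decoding described above might look like the following minimal sketch. It assumes the model emits one row of class probabilities per sample with len(vocab) + 1 columns (index 0 being the out-of-vocabulary slot). The vocabulary and the `probs` tensor are made-up stand-ins, not values from the actual model.

```python
import tensorflow as tf

# Hypothetical label vocabulary; index 0 is reserved for the OOV token "[UNK]".
vocab = ["a", "b", "c", "d"]
decoder = tf.keras.layers.StringLookup(vocabulary=vocab, invert=True)

# Stand-in for the model's predicted probabilities (assumption: a softmax
# over len(vocab) + 1 classes, one row per sample).
probs = tf.constant([
    [0.05, 0.70, 0.10, 0.10, 0.05],  # highest score at index 1
    [0.05, 0.10, 0.10, 0.15, 0.60],  # highest score at index 4
])

class_ids = tf.argmax(probs, axis=-1)   # most likely class index per row
labels = decoder(class_ids)             # map indices back to strings
decoded = [s.decode("utf-8") for s in labels.numpy()]
print(decoded)
```

The decoder returns a string tensor whose elements are bytes, so they are decoded to plain Python strings before printing.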

Here's my problem: while this works fine inside the notebook, it isn't very practical for deploying the model as a REST API, because the calling program has no way of knowing how the encoded integer maps back to the original string representation.

So the question is: what strategy should I use to implement the Keras predict step so that it returns the original string, in a form that I can then serialize using mlflow and cloudpickle and deploy as a servable model in Databricks?

Any guidance would be very much appreciated. I've seen a lot of Keras examples, but none that show how to get this kind of behavior for model deployment.



Solution 1:[1]

This is straightforward. Python can interface with many platforms and languages through scripts, executables, or programmatic interfaces, much as you may have used compiled DLL or COM interfaces that run safely in the background. The encoder and decoder round trip looks like this:

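[ Code ]

import tensorflow as tf

# Example vocabulary and input, chosen to match the output shown below;
# "z" is out of vocabulary, so it encodes to index 0
vocab = ["a", "b", "c", "d"]
data = tf.constant([["a", "c", "d"], ["d", "z", "b"]])

# Generate the encoder and decoder
encoder = tf.keras.layers.StringLookup(vocabulary=vocab)
decoder = tf.keras.layers.StringLookup(vocabulary=vocab, output_mode="int", invert=True)

# Encode strings to integer indices
result = encoder(data)
print(result)

# Decode integer indices back to strings
data = tf.constant([[1, 3, 4], [4, 0, 2]], dtype=tf.int64)
result = decoder(data)
print(result)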

[ Output ]:

# tf.Tensor(
    # [[1 3 4]
    # [4 0 2]], shape=(2, 3), dtype=int64)

# tf.Tensor(
    # [[b'a' b'c' b'd']
    # [b'd' b'[UNK]' b'b']], shape=(2, 3), dtype=string)
    
    # b'[UNK]' marks index 0, which is reserved for out-of-vocabulary (absent) tokens

... Example
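One way to carry the decoder along with the model for serving is to bundle both behind a single predict method, so callers send raw strings and get label strings back. The sketch below is only an illustration under assumptions: `StringPredictor` and `toy_model` are hypothetical names, and `toy_model` is a stand-in for the trained inference model (it simply scores the class of the first token in each row as most likely).

```python
import tensorflow as tf


class StringPredictor:
    """Hypothetical wrapper: raw strings in, decoded label strings out."""

    def __init__(self, model, decoder):
        self.model = model      # callable: string tensor -> class probabilities
        self.decoder = decoder  # StringLookup(..., invert=True)

    def predict(self, rows):
        probs = self.model(tf.constant(rows))
        class_ids = tf.argmax(probs, axis=-1)      # most likely class per row
        labels = self.decoder(class_ids).numpy()   # array of bytes objects
        return [s.decode("utf-8") for s in labels]


vocab = ["a", "b", "c", "d"]
encoder = tf.keras.layers.StringLookup(vocabulary=vocab)
decoder = tf.keras.layers.StringLookup(vocabulary=vocab, invert=True)


def toy_model(strings):
    # Stand-in for the trained model (assumption): one-hot "probabilities"
    # that favor the class index of the first token in each row.
    first_ids = encoder(strings)[:, 0]
    return tf.one_hot(first_ids, depth=len(vocab) + 1)


predictor = StringPredictor(toy_model, decoder)
predictions = predictor.predict([["c", "a", "b"], ["d", "z", "a"]])
print(predictions)
```

A class along these lines could then be called from the predict method of an mlflow.pyfunc.PythonModel subclass, so that the serialized artifact returns strings directly. That packaging step is not shown here and would need to be checked against the mlflow version in use.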

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Martijn Pieters