Clarification on Tensorflow 2.0 Masking
The TensorFlow documentation for the Keras subclassing API gives this example of how to pass a mask along to other layers that implement masking. I am wondering whether this is explicitly required, or whether it is handled automatically once the Embedding layer has mask_zero=True.
```python
import numpy as np
from tensorflow.keras import layers

class MyLayer(layers.Layer):
    def __init__(self, **kwargs):
        super(MyLayer, self).__init__(**kwargs)
        self.embedding = layers.Embedding(input_dim=5000, output_dim=16, mask_zero=True)
        self.lstm = layers.LSTM(32)

    def call(self, inputs):
        x = self.embedding(inputs)
        # Note that you could also prepare a `mask` tensor manually.
        # It only needs to be a boolean tensor
        # with the right shape, i.e. (batch_size, timesteps).
        mask = self.embedding.compute_mask(inputs)
        output = self.lstm(x, mask=mask)  # The layer will ignore the masked values
        return output

layer = MyLayer()
x = np.random.random((32, 10)) * 100
x = x.astype('int32')
layer(x)
```
My confusion comes from another area of the documentation which states:
Masking
This layer supports masking for input data with a variable number of timesteps. To introduce masks to your data, use an Embedding layer with the mask_zero parameter set to True.
Which seems to mean that if mask_zero=True, nothing further needs to be done in subsequent layers.
Solution 1:[1]
If you read about the Masking layer, it also states that once you apply the mask at the beginning, all the remaining layers receive the mask automatically.
Quote:
For each timestep in the input tensor (dimension #1 in the tensor), if all values in the input tensor at that timestep are equal to mask_value, then the timestep will be masked (skipped) in all downstream layers (as long as they support masking).
If any downstream layer does not support masking yet receives such an input mask, an exception will be raised.
This other link states the same: the mask will be propagated to all layers.
Quote:
When using the Functional API or the Sequential API, a mask generated by an Embedding or Masking layer will be propagated through the network for any layer that is capable of using them (for example, RNN layers). Keras will automatically fetch the mask corresponding to an input and pass it to any layer that knows how to use it.
The second link is really full of details on masking.
Notice that the code you showed is for a custom layer. It teaches you how to "create and pass" a mask, in case you want to write a layer that creates one; it is basically showing what the normal Embedding layer already does.
So, we can conclude that if you're using a normal Embedding layer, all you need is mask_zero=True and everything will flow downstream.
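To make that conclusion concrete, here is a minimal sketch (the layer sizes are arbitrary, chosen to match the question) of the Sequential API case, where mask_zero=True alone is enough and the LSTM receives the mask without any manual wiring:

```python
import numpy as np
import tensorflow as tf

# Sequential API: the mask generated by Embedding(mask_zero=True) is
# fetched and passed to the LSTM automatically.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=5000, output_dim=16, mask_zero=True),
    tf.keras.layers.LSTM(32),
])

x = np.random.randint(0, 5000, size=(32, 10))  # 0 entries are treated as padding
print(model(x).shape)                          # (32, 32)
```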
Solution 2:[2]
In addition to the high-level answer given, let's have a look at some important technical details.
If in doubt, inspect the masking source code to understand how it works.
- Masking adds a `_keras_mask` attribute to the tensor, which flags entries to be skipped, effectively letting other API methods know about it.
- Test whether a layer supports the mask via its `supports_masking` attribute. Example: `tf.keras.layers.GlobalMaxPool1D().supports_masking`
- The masking logic is: skip a timestep if all features are equal to the masked value (the TF source code uses `not_equal` and `any` to flag what remains).
```python
import numpy as np
import tensorflow as tf

arr = np.arange(6).reshape((1, 6, 1))            # shape (batch, timesteps, features)
arr_masked = tf.keras.layers.Masking(mask_value=5)(arr)
print(arr_masked._keras_mask)                    # the last timestep (value 5) is False
print(arr_masked.numpy())
```
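As a hypothetical follow-on (not part of the original answer), the mask attached by Masking is picked up automatically by a mask-consuming layer such as LSTM:

```python
import numpy as np
import tensorflow as tf

arr = np.arange(6, dtype=np.float32).reshape((1, 6, 1))
arr_masked = tf.keras.layers.Masking(mask_value=5)(arr)

lstm = tf.keras.layers.LSTM(4)
print(lstm.supports_masking)   # True
out = lstm(arr_masked)         # the timestep whose value is 5 is skipped
print(out.shape)               # (1, 4)
```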
Solution 3:[3]
I think you have to pass the mask from layer to layer in a subclassed model or layer. From the TensorFlow documentation:
Quote:
Note that in the call method of a subclassed model or layer, masks aren't automatically propagated, so you will need to manually pass a mask argument to any layer that needs one.
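For completeness, here is a hedged sketch (the layer name and logic are hypothetical, following the pattern from the masking guide) of the receiving side: a custom layer opts in to mask information by accepting a mask argument in call(); since this layer collapses the time axis, its compute_mask returns None so no stale mask flows on.

```python
import tensorflow as tf

class TemporalMean(tf.keras.layers.Layer):
    """Mean over timesteps that ignores masked positions (illustrative only)."""

    def call(self, inputs, mask=None):
        if mask is None:
            return tf.reduce_mean(inputs, axis=1)
        mask = tf.cast(mask, inputs.dtype)[..., tf.newaxis]  # (batch, time, 1)
        total = tf.reduce_sum(inputs * mask, axis=1)
        count = tf.reduce_sum(mask, axis=1)
        return total / tf.maximum(count, 1.0)

    def compute_mask(self, inputs, mask=None):
        return None  # the time axis is reduced away, so no mask to propagate
```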
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Community |
| Solution 2 | Maciej S. |
| Solution 3 | |
