How do I properly make a tf Dataset for a recurrent neural network?
I have just preprocessed my categorical data into one-hot encoding and used tf.argmax(), which maps each category to a number in the range 1 to 34; so I have sequences like [7, 4, 28, 14, 5, 15, 22, 22].
My question is about the further dataset preparation. I intend to use 3 numbers in a sequence to predict the next 2. So, do I need to map the dataset so the features are the first 3 numbers and the labels are the last 2? Should I batch it with a buffer_size of 5 to specify the sequence length? Lastly, do I keep the one-hot representation or its transformation into a single number?
Solution 1:[1]
No one answered me in time, so I found the solution myself a few days ago.
Just a note that I'm now using 8 numbers to predict the next 2.
First, to predict the next 2 steps, I decided I can predict the label for the 8 initial steps, then make a second prediction using the last 7 of those steps plus the predicted value as the 8th input. Second, my teacher told me it was better for the RNN to use one-hot encoding, so I mapped the dataset as features of 8 one-hot vectors and a label of 1 one-hot vector. Third, what impressed me is that I was able to use batching to group the split data sequence, thereby indicating that the last one-hot is my label.
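The two-step scheme described in the first point can be sketched without TensorFlow. This is only an illustration of the feedback loop, with a toy stand-in for the trained model (the real model and its predict call come later in the answer):

```python
def predict_two_steps(predict_fn, window):
    """Autoregressive two-step prediction, as described above.

    predict_fn : maps a length-8 window to the next value
    window     : the 8 most recent values
    """
    window = list(window)
    first = predict_fn(window)      # predict step t+1 from steps t-7..t
    window = window[1:] + [first]   # drop the oldest step, append the prediction
    second = predict_fn(window)     # predict step t+2 using the predicted t+1
    return first, second

# toy stand-in "model": predicts the rounded mean of the window
toy_predict = lambda w: int(round(sum(w) / len(w)))
print(predict_two_steps(toy_predict, [7, 4, 28, 14, 5, 15, 22, 22]))
```

With the real model, `predict_fn` would wrap `model.predict` on a one-hot window instead.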
So here is the code:
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.models import Model

NUM_CLASSES = 53  # number of possible categories

INPUT_WIDTH = 8
LABEL_WIDTH = 1
shift = 1
INPUT_SLICE = slice(0, INPUT_WIDTH)
total_window_size = INPUT_WIDTH + shift
label_start = total_window_size - LABEL_WIDTH
LABELS_SLICE = slice(label_start, None)
BATCH_SIZE = INPUT_WIDTH + LABEL_WIDTH
Above are some constants that I took from [1]. The only one I couldn't quite understand is the shift var, but set it to 1 and you're fine.
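To see what shift and the two slices actually select, here is a toy window of plain integers (my own example, not from [1]): shift is how far past the end of the inputs the label window reaches, so with shift = 1 and LABEL_WIDTH = 1 the label is the single step right after the 8 inputs.

```python
INPUT_WIDTH = 8
LABEL_WIDTH = 1
shift = 1                                # labels reach 1 step past the inputs
INPUT_SLICE = slice(0, INPUT_WIDTH)
total_window_size = INPUT_WIDTH + shift  # 9 steps per window
label_start = total_window_size - LABEL_WIDTH
LABELS_SLICE = slice(label_start, None)

window = list(range(9))      # one toy window: steps 0..8
print(window[INPUT_SLICE])   # the 8 input steps
print(window[LABELS_SLICE])  # the step right after them
```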
Below, the split function:
def split_window(features):
    inputs = features[INPUT_SLICE]
    # features[LABELS_SLICE] has shape (LABEL_WIDTH, 53) = (1, 53); squeeze the
    # length-1 window axis so the label has the (53,) shape reported below
    labels = tf.squeeze(features[LABELS_SLICE], axis=0)
    return inputs, labels
Simple and neat, isn't it? But I'll be honest: adapting this function from [1] to be compatible with my input shape was a pain.
Now the dataset:
def create_seq_dataset(data):
    ds = tf.data.Dataset.from_tensor_slices(data)
    # At this point, my input was a single array of 3644 string categories to be
    # turned into one-hot. The function below just one-hots the data and casts it
    # to float32, as required by the RNN, so I won't cover it in detail.
    ds = ds.map(get_seq_label)
    # As I'm using from_tensor_slices(), the 3644 number disappears from the shape.
    # From shape (1), I get shape (53) after one-hotting, 53 being the number of
    # possible categories that I'm working with.
    # Here I batch the one-hot data:
    ds = ds.batch(BATCH_SIZE, drop_remainder=True)
    # Shape (53) => (9, 53)
    # Without drop_remainder, the RNN will complain that it can't predict "empty" data.
    ds = ds.map(split_window)
    # Features shape (8, 53), labels shape (53,)
    ds = ds.batch(16, drop_remainder=True)
    # Features shape (16, 8, 53), labels shape (16, 53)
    return ds
train_ds = create_seq_dataset(train_df)

for features_batch, labels_batch in train_ds:
    print(features_batch.shape)
    print(labels_batch.shape)
    break
What I got from the print was: (16, 8, 53) and (16, 53)
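If you want to check the shapes without the original data, here is a self-contained reproduction with synthetic category ids (my assumption: random ids stand in for the real categories, and an inline one-hot map replaces get_seq_label):

```python
import numpy as np
import tensorflow as tf

NUM_CLASSES = 53
ids = np.random.randint(0, NUM_CLASSES, size=3644)  # stand-in for the 3644 categories

ds = tf.data.Dataset.from_tensor_slices(ids)
ds = ds.map(lambda i: tf.one_hot(i, NUM_CLASSES, dtype=tf.float32))  # (53,)
ds = ds.batch(9, drop_remainder=True)                                # (9, 53)
ds = ds.map(lambda w: (w[:8], w[8]))    # inputs (8, 53), label (53,)
ds = ds.batch(16, drop_remainder=True)  # (16, 8, 53) and (16, 53)

features_batch, labels_batch = next(iter(ds))
print(features_batch.shape, labels_batch.shape)
```

This prints (16, 8, 53) (16, 53), matching the shapes above.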
Lastly, the LSTM:
def create_rnn_model():
    # Reminder: the batch size is implicit when the input is a dataset, unless
    # the LSTM is stateful.
    inputs = layers.Input(name="sequence", shape=(INPUT_WIDTH, 53), dtype='float32')
    # From [1]: shape [batch, time, features] => [batch, time, lstm_units] with
    # return_sequences=True. With return_sequences=False, as here, the time axis
    # is dropped: [16, 8, 53] => [16, 100], where 16 is the implicit batch size
    # and appears as None in rnn_model.summary().
    x = layers.LSTM(100, activation='tanh', stateful=False, return_sequences=False, name='LSTM_1')(inputs)
    # Shape after this layer: [16, 100], or [None, 100].
    x = layers.Dense(32, activation='relu')(x)
    output = layers.Dense(NUM_CLASSES, activation='softmax')(x)
    model = Model(inputs=inputs, outputs=output)
    optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
    model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
    return model

rnn_model = create_rnn_model()
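After training with something like rnn_model.fit(train_ds, epochs=...), the two-step prediction scheme from the start of this answer can be written as a small helper. This is my sketch, not code from [1]; it assumes a Keras-style model whose predict returns (batch, 53) softmax probabilities:

```python
import numpy as np

NUM_CLASSES = 53

def predict_next_two(model, window):
    """Two-step autoregressive prediction, as described in the answer.

    model  : anything with a Keras-style .predict(batch) -> (batch, 53) softmax
    window : float32 array of shape (8, 53), the last 8 one-hot steps
    """
    probs1 = model.predict(window[np.newaxis])[0]  # (53,)
    step1 = int(np.argmax(probs1))
    one_hot1 = np.eye(NUM_CLASSES, dtype=np.float32)[step1]
    # drop the oldest step, append the predicted one as the new 8th input
    shifted = np.concatenate([window[1:], one_hot1[np.newaxis]], axis=0)
    probs2 = model.predict(shifted[np.newaxis])[0]
    step2 = int(np.argmax(probs2))
    return step1, step2
```

Called as predict_next_two(rnn_model, last_window), it returns the predicted class ids for the next two steps.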
[1]. https://www.tensorflow.org/tutorials/structured_data/time_series#4_create_tfdatadatasets
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Snykral fa Ashama |
