Why do X_train, y_train, X_test and y_test each shrink by about 100 rows after I apply windowed_dataset in Python (prediction with deep learning)?

I have a problem with my code: I don't know why the number of rows in X_train, y_train, X_test and y_test drops by time_step (100) plus 1. I expected to keep the original lengths, like this: ((1237, 100), (1237,), (310, 100), (310,))

train_data, test_data = price_series_scaled[0:1237], price_series_scaled[1237:]

len(train_data)  # 1237
len(test_data)   # 310

train_data.shape, test_data.shape
((1237, 1), (310, 1))

def windowed_dataset(series, time_step):
    dataX, dataY = [], []
    for i in range(len(series)- time_step-1):
        a = series[i : (i+time_step), 0]
        dataX.append(a)
        dataY.append(series[i+ time_step, 0])
        
    return np.array(dataX), np.array(dataY)

X_train, y_train = windowed_dataset(train_data, time_step=100)
X_test, y_test = windowed_dataset(test_data, time_step=100)


X_train.shape, y_train.shape, X_test.shape, y_test.shape
((1136, 100), (1136,), (209, 100), (209,))


Solution 1:[1]

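The difference comes from the window length and from how the values are aligned inside each window. Every sample needs time_step consecutive input values plus one following value as the target, so the loop in the question can only produce len(series) - time_step - 1 samples (1237 - 100 - 1 = 1136 and 310 - 100 - 1 = 209). As I understand it, you are trying to extract features from audio, or from some other target series, with a window length of 100.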
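The row counts in the question can be reproduced with a short sketch. It reuses the question's windowed_dataset unchanged; the np.arange series is placeholder data chosen only to match the shape (1237, 1), not the original price series.

```python
import numpy as np

def windowed_dataset(series, time_step):
    # Same sliding-window logic as in the question.
    dataX, dataY = [], []
    for i in range(len(series) - time_step - 1):
        dataX.append(series[i:i + time_step, 0])   # time_step input values
        dataY.append(series[i + time_step, 0])     # the value right after the window
    return np.array(dataX), np.array(dataY)

# Placeholder data (assumption): only the shape matches the question's train_data.
train_data = np.arange(1237, dtype=float).reshape(-1, 1)
time_step = 100

X_train, y_train = windowed_dataset(train_data, time_step)
print(X_train.shape, y_train.shape)        # (1136, 100) (1136,)
print(len(train_data) - time_step - 1)     # 1136
```

Of the 101 missing rows, 100 are unavoidable because the first window needs 100 earlier values. The extra 1 comes from the "- 1" in the range; range(len(series) - time_step) would still index validly and give one more sample.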

[ Sample ] :

import numpy as np
import math
import tensorflow as tf

import matplotlib.pyplot as plt

contents = tf.io.read_file("F:\\temp\\Python\\Speech\\temple_of_love-sisters_of_mercy.wav")
audio, sample_rate = tf.audio.decode_wav(
    contents, desired_channels=-1, desired_samples=-1, name=None
)

train_data, test_data = audio[50 * 1237:51 * 1237].numpy(), audio[52 * 1237:53 * 1237].numpy()

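def windowed_dataset(series, time_step):
    # Split the series into consecutive, non-overlapping chunks of time_step samples.
    dataX, dataY = [], []
    for i in range(math.ceil(len(series) / time_step)):
        source = time_step * i
        dest = time_step * (i + 1)
        dataX.append(series[source:dest, 0])
        # The target here is the same chunk as the input.
        dataY.append(series[source:dest, 0])

    # The last chunk can be shorter than time_step, so the result is ragged;
    # recent NumPy versions need dtype=object to build such an array.
    return np.array(dataX, dtype=object), np.array(dataY, dtype=object)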

X_train, y_train = windowed_dataset(train_data, time_step=100)
X_test, y_test = windowed_dataset(test_data, time_step=100)

plt.plot(X_train[1])
plt.show()
plt.close()

print(X_train.shape)
print(y_train.shape)
print(X_test.shape)
print(y_test.shape)

[ Output ] :

[ 0.06628418  0.13339233  0.09823608  0.03137207 -0.00985718 -0.08621216
 -0.04876709  0.08459473  0.09558105  0.08746338  0.03610229  0.13031006
  0.12753296  0.08270264  0.08920288  0.18014526  0.08901978  0.05679321
 -0.00701904 -0.04037476 -0.07434082 -0.07824707 -0.15322876 -0.1824646
 -0.0944519  -0.07226562 -0.02203369 -0.17202759 -0.18380737 -0.18643188
 -0.02816772 -0.03457642 -0.06304932  0.01519775  0.09963989  0.09661865
  0.04107666 -0.01071167  0.02893066  0.05361938  0.08685303  0.06866455
  0.03787231  0.00048828  0.14135742  0.08670044  0.05126953 -0.03884888
  0.09957886  0.19561768  0.21575928  0.1807251   0.18737793  0.09906006
  0.15802002  0.02886963  0.05886841  0.12005615  0.17202759  0.14172363
  0.08731079  0.00262451 -0.04882812 -0.05090332 -0.01583862  0.04284668
  0.01327515 -0.04296875  0.01281738  0.04425049  0.02297974 -0.0032959
  0.03491211 -0.02828979  0.05282593 -0.02893066 -0.09103394 -0.09231567
 -0.06265259  0.13113403  0.11938477  0.09963989  0.10992432  0.02728271
  0.06658936  0.13491821  0.09960938  0.03689575  0.09088135  0.17120361
  0.13201904  0.06710815 -0.04443359 -0.0506897  -0.05752563 -0.03656006
 -0.06747437 -0.16769409 -0.26519775 -0.22238159]
(13,)
(13,)
(13,)
(13,)
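The (13,) shapes follow from the chunk arithmetic: 1237 samples split into blocks of 100 give ceil(1237 / 100) = 13 chunks, and the last one holds only 37 samples, so the chunks cannot be stacked into a regular 2-D array. A minimal sketch of that count, using placeholder data instead of the audio file (an assumption), and showing one possible way to keep only the full chunks:

```python
import math
import numpy as np

# Placeholder data (assumption): same length as the 1237-sample slices in the answer.
series = np.arange(1237, dtype=float).reshape(-1, 1)
time_step = 100

# Non-overlapping chunks, as in the answer's windowed_dataset.
n_chunks = math.ceil(len(series) / time_step)
chunks = [series[time_step * i : time_step * (i + 1), 0] for i in range(n_chunks)]
lengths = [len(c) for c in chunks]
print(n_chunks)       # 13
print(lengths[-1])    # 37

# Dropping the short final chunk gives a regular 2-D array.
full = np.array([c for c in chunks if len(c) == time_step])
print(full.shape)     # (12, 100)
```

Whether to drop or pad the final partial chunk depends on the model; the sketch only drops it to make the shape regular.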

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Martijn Pieters