Custom training steps with sliding window in TensorFlow (from PyTorch)
I'm working on a custom transformer model where the training step looks like this:
# simplified version of my training loop, where model = myTransformerModel()
for window in data:  # step through data
    l1 = model(window)
    loss = torch.mean(l1)
    optimizer.zero_grad()
    loss.backward(retain_graph=True)
    optimizer.step()
    scheduler.step()
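For reference, the loop above can be made into a minimal runnable sketch. The `Linear` model, the random "windows", and the `StepLR` scheduler here are hypothetical stand-ins for the real transformer, data, and scheduler:

```python
import torch

# Hypothetical stand-ins for the real model and sliding-window data.
model = torch.nn.Linear(8, 1)
data = [torch.randn(4, 8) for _ in range(5)]  # five "windows"

optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
# Decays the learning rate by gamma every time scheduler.step() is called.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.9)

for window in data:
    l1 = model(window)
    loss = torch.mean(l1)
    optimizer.zero_grad()
    loss.backward()   # retain_graph is only needed if the same graph is reused
    optimizer.step()
    scheduler.step()  # the learning rate decays once per window here
```

Note that in PyTorch the scheduler only moves when `scheduler.step()` is called explicitly, which is the behavior the TensorFlow version below fails to reproduce.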
I'm trying to recreate this in TensorFlow; currently it looks like this:
for window in data:  # step through data
    with tf.GradientTape() as tape:
        l1 = model(window)  # call the model directly rather than model.call()
        loss = tf.reduce_mean(l1)
    train = optimizer.minimize(loss, var_list=model.trainable_variables, tape=tape)
This works, but the learning-rate schedule attached to the optimizer advances on every window (each minimize call increments optimizer.iterations, which drives Keras learning-rate schedules), which throws off the learning rate.
I have also tried this in place of the minimize line:
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
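One way to mimic PyTorch's explicit `scheduler.step()` is to give the optimizer a plain float learning rate (so nothing decays automatically per `apply_gradients` call) and decay it manually only when you choose. A minimal sketch, where the `Dense` model, random windows, and `scheduler_step` helper are hypothetical stand-ins:

```python
import tensorflow as tf

# Hypothetical stand-ins for the real transformer and sliding-window data.
model = tf.keras.layers.Dense(1)
data = [tf.random.normal((4, 8)) for _ in range(5)]

# Plain float learning rate: with no LearningRateSchedule attached, nothing
# decays automatically when apply_gradients() runs.
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

def scheduler_step(opt, gamma=0.9):
    # Manual analogue of PyTorch's scheduler.step(): decays only when called.
    opt.learning_rate.assign(opt.learning_rate * gamma)

for window in data:
    with tf.GradientTape() as tape:
        l1 = model(window)          # call the model directly, not model.call()
        loss = tf.reduce_mean(l1)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))

scheduler_step(optimizer)  # step the "scheduler" once per pass, not per window
```

Placing `scheduler_step` outside the window loop (or calling it on whatever cadence the PyTorch scheduler used) decouples the learning-rate decay from the per-window gradient updates.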
Is there a good way to make the TensorFlow version behave like the PyTorch one? Is there a better way to implement the training step with GradientTape?
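As a closely related pattern, the per-window step is often factored into a `tf.function`-compiled function, which keeps the outer loop as plain Python (so any manual scheduling stays outside the compiled step). A sketch with a hypothetical stand-in model:

```python
import tensorflow as tf

model = tf.keras.layers.Dense(1)   # hypothetical stand-in for the transformer
model.build((None, 8))             # create variables before tracing the graph
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)

@tf.function  # compiles the per-window step into a graph for speed
def train_step(window):
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(model(window))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

data = [tf.random.normal((4, 8)) for _ in range(3)]  # stand-in windows
for window in data:
    loss = train_step(window)
```

Building the model before the first traced call avoids the `tf.function` restriction on creating variables after the first trace.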
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
