Loaded model gives a large loss and training does not continue. Why?

This is my first time running an ML model in PyTorch. My dataset is very large, so I am trying to save the model after some iterations and resume training from that checkpoint, so that I do not have to start over if the program crashes partway through. To check that a loaded model can continue training, I trained for two epochs (epoch in range(2)), saved the model, loaded it, and trained for two more epochs (epoch in range(2, 4)). I found two problems: (1) the model did not update, since the losses for epochs 3 and 4 are identical, and (2) the new loss is very large. It seems that I either did not save the model correctly or did not load it correctly. Thanks in advance!

print("Trainloss", Trainloss) 

print("R-square", r_squared)

*Trainloss [1470600.5, 0.8635099530220032]
R-square [-10589209.551380618, -5.086684550040197]*


save_path = "./savedmodel.pth"
# Save everything needed to resume: weights, optimizer state, and training history.
torch.save({
    'epoch': epoch,
    'Rsquare': r_squared,
    'loss': loss,
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    'TRAIN_LOSS': Trainloss,
}, save_path)

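device = torch.device("cuda")
lr = 1e-2
# Build the model and move it to the device before creating the optimizer,
# so that the optimizer tracks the parameters of this model instance.
model = NeuralNetwork()
model.to(device)
optimizer = optim.Adam(model.parameters(), lr=lr)
# Restore the weights, the optimizer state, and the training history.
checkpoint = torch.load(save_path, map_location=device)
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
loss = checkpoint['loss']
r_squared = checkpoint['Rsquare']
Trainloss = checkpoint['TRAIN_LOSS']
epoch = checkpoint['epoch']
model.train()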
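
One way to test whether the save and load steps themselves are at fault is to check that a checkpoint round-trips: a freshly built model that loads the saved state should hold the same weights and give the same loss on a fixed batch. The sketch below is only an illustration under assumptions. TinyNet is a hypothetical stand-in for NeuralNetwork, the data is synthetic, and everything runs on the CPU.

```python
import os
import tempfile

import torch
from torch import nn, optim

torch.manual_seed(0)


class TinyNet(nn.Module):
    """Hypothetical stand-in for the NeuralNetwork class in the question."""

    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(nn.Linear(3, 8), nn.ReLU(), nn.Linear(8, 1))

    def forward(self, x):
        return self.layers(x)


# Synthetic regression data (assumption: the real data is not shown).
x = torch.randn(64, 3)
y = x.sum(dim=1, keepdim=True)
loss_fn = nn.MSELoss()

# Train for a few steps so the weights differ from a fresh initialization.
model = TinyNet()
optimizer = optim.Adam(model.parameters(), lr=1e-2)
for _ in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

# Save the model and optimizer state to a temporary file.
path = os.path.join(tempfile.mkdtemp(), "checkpoint.pth")
torch.save(
    {
        "model_state_dict": model.state_dict(),
        "optimizer_state_dict": optimizer.state_dict(),
    },
    path,
)

# Rebuild the model first, then the optimizer, then load both states.
restored = TinyNet()
restored_optimizer = optim.Adam(restored.parameters(), lr=1e-2)
checkpoint = torch.load(path, map_location="cpu")
restored.load_state_dict(checkpoint["model_state_dict"])
restored_optimizer.load_state_dict(checkpoint["optimizer_state_dict"])

# Compare every tensor in the two state dicts, and the loss on the same batch.
weights_match = all(
    torch.equal(a, b)
    for a, b in zip(model.state_dict().values(), restored.state_dict().values())
)
with torch.no_grad():
    loss_before = loss_fn(model(x), y).item()
    loss_after = loss_fn(restored(x), y).item()

print("weights match:", weights_match)
print("loss before save:", loss_before, "loss after load:", loss_after)
```

If the weights match and the two losses agree, the checkpoint itself is fine, and the problem is more likely in how the training loop runs after loading.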

print(Trainloss)
print(r_squared)
  
*trainloss: [1470600.5, 0.8635099530220032, 439699.84375, 439699.84375]
r2: [-10589209.551380618, -5.086684550040197, -3161013.534033724, -3161013.534033724]*
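
For comparison, here is a minimal sketch of a resume flow in which the loss history grows by one new value per epoch after loading. It is not the code from the question. The helper names make_model and train are hypothetical, the data is synthetic, each epoch is a single full-batch step, and it assumes the checkpoint stores the index of the last completed epoch so that training resumes at epoch + 1.

```python
import os
import tempfile

import torch
from torch import nn, optim


def make_model():
    # Hypothetical small network standing in for NeuralNetwork.
    return nn.Sequential(nn.Linear(3, 8), nn.ReLU(), nn.Linear(8, 1))


def train(model, optimizer, x, y, start_epoch, end_epoch, history):
    """Run epochs start_epoch..end_epoch-1 and append one loss per epoch."""
    loss_fn = nn.MSELoss()
    model.train()
    for _ in range(start_epoch, end_epoch):
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)  # recomputed every epoch
        loss.backward()
        optimizer.step()
        history.append(loss.item())
    return end_epoch - 1  # index of the last completed epoch


torch.manual_seed(0)
x = torch.randn(64, 3)
y = x.sum(dim=1, keepdim=True)

# First run: epochs 0 and 1, then save a checkpoint.
model = make_model()
optimizer = optim.Adam(model.parameters(), lr=1e-2)
history = []
last_epoch = train(model, optimizer, x, y, 0, 2, history)

path = os.path.join(tempfile.mkdtemp(), "savedmodel.pth")
torch.save(
    {
        "epoch": last_epoch,
        "model_state_dict": model.state_dict(),
        "optimizer_state_dict": optimizer.state_dict(),
        "train_loss": history,
    },
    path,
)

# Second run: rebuild, load, and continue with epochs 2 and 3.
resumed_model = make_model()
resumed_optimizer = optim.Adam(resumed_model.parameters(), lr=1e-2)
checkpoint = torch.load(path, map_location="cpu")
resumed_model.load_state_dict(checkpoint["model_state_dict"])
resumed_optimizer.load_state_dict(checkpoint["optimizer_state_dict"])
resumed_history = list(checkpoint["train_loss"])
train(
    resumed_model,
    resumed_optimizer,
    x,
    y,
    checkpoint["epoch"] + 1,
    4,
    resumed_history,
)

print(resumed_history)
```

In this sketch the forward pass, backward pass, and optimizer step all sit inside the epoch loop, so each appended loss reflects updated weights. If the two losses after loading are identical in your run, it may be worth checking that the same is true of your own loop.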

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
