PyTorch DataLoader length inconsistent

I am stuck and have found something very abnormal. I am training an image classifier:

def train(self, current_epoch=0, is_init=False, fold=0):
    self.logger.info('Current fold: %d' % fold)

    self.model.train()
    if not is_init:
        self.scheduler.step()
    self.logger.info('Current epoch learning rate: %e' % self.optimizer.param_groups[0]['lr'])

    running_loss = 0.0
    running_acc = 0.0
    # one dict of loaders per fold, keyed by the fold index as a string
    dataloader = self.dataloader[str(fold)]
    optimizer = torch.optim.Adam(self.model.parameters(), lr=0.0001, weight_decay=0.0001)
    self.logger.info('Len of train loader: ' + str(len(dataloader['train'])))
    for i, sample in enumerate(dataloader['train'], 1):
        self.logger.info("i: {}".format(i))
        sample = self.prepare(sample)
        images, labels = sample['img'], sample['label']
        optimizer.zero_grad()
        outputs = self.model(images)
        loss = self.loss_fn(outputs, labels)
        loss.backward()
        optimizer.step()
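
For reference, self.dataloader is a dict keyed by the fold index as a string, where each entry holds the loaders for that fold. Roughly like the sketch below (a simplified reconstruction with made-up helper and variable names, not the actual project code):

    from torch.utils.data import DataLoader, Subset

    # Simplified reconstruction of how the per-fold loaders are organised
    # (illustrative only; 'dataset' and 'fold_indices' are placeholders).
    def build_fold_loaders(dataset, fold_indices, batch_size=32):
        loaders = {}
        for fold, train_idx in enumerate(fold_indices):
            loaders[str(fold)] = {
                'train': DataLoader(Subset(dataset, train_idx),
                                    batch_size=batch_size, shuffle=True),
            }
        return loaders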

I found the output log very strange. [screenshot of the training log] As shown in the picture above, the first epoch only runs 2 batches, while the second one runs far more than that: in fact, it runs 12, and epoch 3 runs 15 batches. [screenshot of the training log] Shouldn't every epoch run the same number of batches? Why are they not 129, since len(dataloader['train']) is 129?
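
For comparison, here is a minimal stand-alone check (toy tensors with made-up sizes, not my real data) of what I would expect: a plain DataLoader yields exactly len(loader) batches on every pass.

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    # Toy dataset: 1290 fake samples -> 129 batches of 10.
    dataset = TensorDataset(torch.randn(1290, 3, 32, 32),
                            torch.randint(0, 10, (1290,)))
    loader = DataLoader(dataset, batch_size=10, shuffle=True)

    for epoch in range(3):
        n_batches = sum(1 for _ in loader)
        # Prints 129 for every epoch; the count never changes between passes.
        print('epoch %d: len=%d, iterated=%d' % (epoch, len(loader), n_batches))

So unless something raises an exception or breaks out of the loop early, the count should be identical every epoch.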

You may find the entire project here: Project

Thank you sincerely in advance. This is so frustrating.


