'The size of tensor a (3) must match the size of tensor b (32) at non-singleton dimension 1 [closed]

I was training a deep learning model, but I am encountering this error: The size of tensor a (3) must match the size of tensor b (32) at non-singleton dimension 1. Also, while training, the reported accuracy goes above 1, e.g. 1.04 or 1.06. Below is the training code:

def train(model,criterion,optimizer,iters):
    epochs = iters
    train_loss = []
    validation_loss = []
    train_acc = []
    validation_acc = []
    states = ['Train','Valid']
    for epoch in range(epochs):
        print("epoch : {}/{}".format(epoch+1,epochs))
        for phase in states:
            if phase == 'Train':
                model.train()
                dataload = train_data_loader
            else:
                model.eval()
                dataload = valid_data_loader

            run_loss,run_acc = 0,0
            for data in dataload:
                inputs,labels = data
                #print("Inputs:",inputs.shape)
                #print("Labels:",labels.shape)
                inputs = inputs.to(device)
                labels = labels.to(device)
                labels = labels.byte()
                optimizer.zero_grad()

                with torch.set_grad_enabled(phase == 'Train'):
                    outputs = model(inputs)
                    print("Outputs",outputs.shape)
                    loss = criterion(outputs,labels)

                    predict = outputs >= 0.5
                    #print("Predict",predict.shape)
                    if phase == 'Train':
                        loss.backward()
                        optimizer.step()

                    acc = torch.sum(predict == labels.data)

                run_loss += loss.item()
                #print("Running_Loss",run_loss)
                run_acc += acc.item()/len(labels)
                #print("Running_Acc",run_acc)
            if phase == 'Train':
                epoch_loss = run_loss/len(train_data_loader)
                train_loss.append(epoch_loss)
                epoch_acc = run_acc/len(train_data_loader)
                train_acc.append(epoch_acc)
            else:
                epoch_loss = run_loss/len(valid_data_loader)
                validation_loss.append(epoch_loss)
                epoch_acc = run_acc/len(valid_data_loader)
                validation_acc.append(epoch_acc)

            print("{}, loss :{}, accuracy:{}".format(phase,epoch_loss,epoch_acc))

    history = {'Train_loss':train_loss,'Train_accuracy':train_acc,
               'Validation_loss':validation_loss,'Validation_Accuracy':validation_acc}
    return model,history

Below is the code for the base model:

model = models.resnet34(pretrained = True)

for param in model.parameters():
  param.requires_grad = False

model.fc = nn.Sequential(nn.Linear(model.fc.in_features,out_features = 1024),nn.ReLU(),
                         nn.Linear(in_features = 1024,out_features = 512),nn.ReLU(),
                         nn.Dropout(0.3),
                         nn.Linear(in_features=512,out_features=256),nn.ReLU(),
                         nn.Linear(in_features = 256,out_features = 3),nn.LogSoftmax(dim = 1))

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)
model.to(device)

optimizer = optim.Adam(model.parameters(),lr = 0.00001)
criterion = nn.CrossEntropyLoss()

I tried predict == labels.unsqueeze(1); it didn't raise any error, but the accuracy still goes over 1. May I know where I have to change the code?



Solution 1:[1]

Your output tensor is of size [32, 3]: 32 is the mini-batch size and 3 is the number of outputs of your neural network, e.g.

[[0.25, 0.45, 0.3],
 [0.45, 0.15, 0.4],
      ....
      ....
 [0.2, 0.15, 0.65]]

When you compute output >= 0.5, the result is the predict tensor, but it is a bool tensor with the same size as output, [32, 3], like this:

[[False, False, False],
 [False, False, False],
      ....
      ....
 [False, False, True]]

and labels is a 1D tensor with 32 values, e.g.

[0,2,...,0]
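The shape mismatch in the question can be reproduced directly from these two shapes (a minimal sketch with dummy tensors, not the actual model outputs):

```python
import torch

predict = torch.zeros(32, 3, dtype=torch.bool)  # [32, 3] thresholded outputs
labels = torch.randint(0, 3, (32,))             # [32] class indices

# Broadcasting aligns trailing dimensions, so 3 is compared against 32
# and the comparison fails with the error from the question:
try:
    predict == labels
except RuntimeError as e:
    print(e)  # The size of tensor a (3) must match the size of tensor b (32) ...
```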

The cause of the problem is here: to compare predicts with labels, you should select the index of the maximum probability from each row of the predicts tensor, like this:

predicts = predicts.argmax(1) 
# output 
[0,0,...,2]

But predicts is a bool tensor, and you cannot apply argmax to a bool tensor directly. That is why you got the error message you mentioned in the comment. To solve this, you only need to do the following:
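As a quick sketch with a dummy two-sample batch (not the real model outputs), this is what the integer cast buys you:

```python
import torch

# Dummy "probabilities" for 2 samples, 3 classes
output = torch.tensor([[0.25, 0.45, 0.30],
                       [0.20, 0.15, 0.65]])

predict = output >= 0.5         # bool tensor, shape [2, 3]
predicts = (output >= 0.5) * 1  # int64 tensor of 0s and 1s -- argmax works on this

print(predict.dtype)       # torch.bool
print(predicts.argmax(1))  # tensor([0, 2])
```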

 predicts = (output >= 0.5)*1

Now you can take argmax(1) of predicts and compare it with labels, because both then have the same size, [32].

In brief, you should use:

predicts = (output >= 0.5)*1
acc = torch.sum(predicts.argmax(1) == labels)

Your error is gone, but the accuracy is logically still not correct. Be careful when using a sigmoid-style threshold (output >= 0.5) for a multi-class problem: you have 3 classes in the output. Imagine the output is [0.15, 0.45, 0.4]. Your predict will be [0, 0, 0], and argmax(1) will then select the first index because all entries are equal, even though the second index should be selected, as it has the largest probability. For a multi-class problem, the best approach is to take the class with the highest score (softmax-style) instead of thresholding at 0.5.
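A minimal sketch of the failure case just described:

```python
import torch

# One sample where no class probability reaches the 0.5 threshold
output = torch.tensor([[0.15, 0.45, 0.40]])

predicts = (output >= 0.5) * 1
print(predicts)            # tensor([[0, 0, 0]]) -- every class rejected
print(predicts.argmax(1))  # tensor([0]) -- the tie resolves to index 0

# argmax over the raw scores picks the class that is actually largest:
print(output.argmax(1))    # tensor([1])
```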

By the way, if you look back at your model structure (last line), you will see that you already use nn.LogSoftmax. You just need to remove the line predict = outputs >= 0.5 and use directly:

# before the for loop
num_corrects = 0

# inside the for loop
num_corrects = num_corrects + torch.sum(outputs.argmax(1) == labels)

# outside the loop
train_accuracy = 100. * num_corrects / len(train_data_loader.dataset)
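Put together, a sketch of the corrected accuracy bookkeeping over one epoch. Dummy (outputs, labels) batches stand in for the real train_data_loader; in the real loop, outputs come from model(inputs):

```python
import torch

# Two dummy mini-batches of (outputs, labels), 4 samples total
dummy_batches = [
    (torch.tensor([[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]]), torch.tensor([1, 0])),
    (torch.tensor([[0.2, 0.2, 0.6], [0.3, 0.5, 0.2]]), torch.tensor([2, 0])),
]
dataset_size = 4

num_corrects = 0
for outputs, labels in dummy_batches:
    # class prediction = index of the largest score per row
    num_corrects += torch.sum(outputs.argmax(1) == labels).item()

train_accuracy = 100. * num_corrects / dataset_size
print(train_accuracy)  # 75.0 -- three of four predictions correct, never above 100
```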

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow