RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x20 and 1x1)

I am new to AI and have followed a tutorial to build a linear regression model. When I run the code, I get a RuntimeError that says "mat1 and mat2 shapes cannot be multiplied (1x20 and 1x1)". I think the x_train variable may be causing this problem, but I don't know how to fix it. Here is the code of the program.

import torch
import torch.utils.data
import torch.nn as nn
import matplotlib.pyplot as plt


x_train = torch.arange(1,21,dtype=torch.float32)
true_w = 2.0
true_b = 3
y_true = torch.tensor(true_w*x_train+true_b,dtype=torch.float32)
y_train = y_true-torch.normal(mean=0,std=5,size=(1,20))

net = nn.Sequential(
    nn.Linear(1,1)
)
epochs = 1000
learning_rate = 0.01
optimizer = torch.optim.SGD(net.parameters(),lr=learning_rate)

for epoch in range(epochs):
    optimizer.zero_grad()
    # outputs = net(x_train)
    # loss  = nn.MSELoss(outputs,y_true)
    # line below caused RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x20 and 1x1)
    outputs = net(x_train)
    loss = nn.MSELoss(outputs,y_true)
    if epoch %100==0:
        print('epoch:{},loss:{}'.format(epoch,loss.item()))
    loss.backward()
    loss.step()



Solution 1:[1]

Your x_train has shape torch.Size([20]). nn.Linear treats the last dimension of its input as the feature dimension, so your model sees that as a single vector with 20 features, but your model is defined as nn.Linear(1,1), so it's expecting inputs with 1 feature.

If you want x_train to be a batch of 20 examples with size 1, you can use unsqueeze() to add an extra dimension:

x_train = torch.arange(1,21,dtype=torch.float32).unsqueeze(1)
print(x_train.shape)

The output will be torch.Size([20, 1]), so your input now has a batch dimension of 20, with each example being size 1.
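To double-check the fix before touching the training loop, you can pass the reshaped tensor through the model (a quick sanity check, reusing the imports and names from your code):

x_train = torch.arange(1, 21, dtype=torch.float32).unsqueeze(1)  # shape (20, 1): 20 samples, 1 feature each
net = nn.Sequential(nn.Linear(1, 1))
print(net(x_train).shape)  # torch.Size([20, 1]) -- no shape error now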

I think there are a couple of other things to fix in your training loop too:

  • nn.MSELoss(outputs,y_true) constructs the loss module with the wrong arguments instead of computing a loss. Create the MSELoss object once before the loop (criterion = nn.MSELoss()) and then call it inside the loop as criterion(outputs, y_true), rather than re-creating it every epoch.
  • loss.step() will throw an AttributeError, since loss is a tensor: you should call optimizer.step() instead.

The final code will look like this:

import torch
import torch.utils.data
import torch.nn as nn
import matplotlib.pyplot as plt

x_train = torch.arange(1, 21, dtype=torch.float32).unsqueeze(1)
true_w = 2.0
true_b = 3
y_true = true_w*x_train + true_b  # x_train is already a float tensor, no need to wrap it in torch.tensor again
y_train = y_true - torch.normal(mean=0.0, std=5.0, size=(20, 1))  # noise shaped (20, 1) to match y_true

net = nn.Sequential(nn.Linear(1, 1))
epochs = 1000
learning_rate = 0.01
optimizer = torch.optim.SGD(net.parameters(), lr=learning_rate)
criterion = nn.MSELoss()

for epoch in range(epochs):
    optimizer.zero_grad()
    outputs = net(x_train)
    loss = criterion(outputs, y_true)
    if epoch % 100 == 0:
        print('epoch: {}, loss: {}'.format(epoch, loss.item()))
    loss.backward()
    optimizer.step()
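
As an optional follow-up check (not part of the fix, and assuming the same net variable as above), you can read the learned weight and bias off the linear layer and compare them to true_w = 2.0 and true_b = 3. If the printed loss grows instead of shrinking, the unnormalized inputs 1..20 may need a smaller learning rate (for example 0.001):

w_learned = net[0].weight.item()  # net[0] is the nn.Linear(1, 1) layer
b_learned = net[0].bias.item()
print('learned w: {:.3f}, learned b: {:.3f}'.format(w_learned, b_learned))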

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Adam Montgomerie