PyTorch: simple recurrent neural network for image classification

I am building a simple recurrent neural network for CIFAR10 image classification. I do not want to use the pre-defined RNN class in PyTorch because I am implementing it from scratch, following the figure below. I am getting an error that the input tensors are not all on the same device, and I am not sure whether my code is right or wrong. Also, is there a simple way to write the FC layer without defining its shape with hard-coded parameters?

Figure

[RNN architecture diagram; image not reproduced here]

Code

import torch
import torch.nn as nn

class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size, num_classes):
        super(RNN, self).__init__()
        self.hidden_size = hidden_size
        self.input_to_hidden = nn.Linear(in_features=input_size + hidden_size, out_features=output_size)
        self.input_to_output = nn.Linear(in_features=input_size + hidden_size, out_features=output_size)
        self.softmax = nn.LogSoftmax(dim=1)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, input_tensor):
        # This line raises the error below: torch.zeros(...) is created
        # on the CPU while input_tensor lives on the GPU.
        combined = torch.cat((input_tensor, torch.zeros(input_tensor.size(0))), 1)
        hidden = self.input_to_hidden(combined)
        output = self.input_to_output(combined)
        output = self.softmax(output)
        return output, hidden

Traceback

Traceback (most recent call last):
  File "/media/cvpr/CM_1/tutorials/rnn.py", line 81, in <module>
    outputs = model(images)
  File "/home/cvpr/anaconda3/envs/tutorials/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/media/cvpr/CM_1/tutorials/rnn.py", line 33, in forward
    combined = torch.cat((input_tensor, torch.zeros(input_tensor.size(0))), 1)
RuntimeError: All input tensors must be on the same device. Received cuda:0 and cpu


Solution 1:[1]

You need to make sure the tensors are on the same device (CPU/GPU) before concatenating them.

You can add a device parameter to your class and use it:

class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size, num_classes, device='cuda'):
        super(RNN, self).__init__()
        self.device = device
        self.hidden_size = hidden_size
        # Note: for a true recurrent step, out_features here should be
        # hidden_size, so the hidden state can be fed back in.
        self.input_to_hidden = nn.Linear(in_features=input_size + hidden_size, out_features=output_size)
        self.input_to_output = nn.Linear(in_features=input_size + hidden_size, out_features=output_size)
        self.softmax = nn.LogSoftmax(dim=1)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, input_tensor):
        # Create the initial hidden state on the same device as the input;
        # it must also be 2-D, shape (batch, hidden_size), so that it can
        # be concatenated along dim 1.
        hidden_init = torch.zeros(input_tensor.size(0), self.hidden_size, device=self.device)
        combined = torch.cat((input_tensor.to(self.device), hidden_init), 1)
        hidden = self.input_to_hidden(combined)
        output = self.input_to_output(combined)
        output = self.softmax(output)
        return output, hidden
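
Note that with this approach the device has to be threaded through as a constructor argument. A simpler pattern is to read the device off the incoming tensor via input_tensor.device. The sketch below is not part of the original answer and assumes PyTorch 1.8+ for nn.LazyLinear; it combines device-agnostic hidden-state creation with the second part of the question, since nn.LazyLinear infers in_features on its first forward pass and so the FC layer needs no hard-coded input shape:

import torch
import torch.nn as nn

class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size, num_classes):
        super().__init__()
        self.hidden_size = hidden_size
        # The hidden branch emits hidden_size features so the hidden
        # state can be concatenated with the next input.
        self.input_to_hidden = nn.Linear(input_size + hidden_size, hidden_size)
        self.input_to_output = nn.Linear(input_size + hidden_size, output_size)
        # LazyLinear infers in_features from the first batch it sees,
        # so no input shape has to be hard-coded here.
        self.fc = nn.LazyLinear(num_classes)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, input_tensor, hidden=None):
        if hidden is None:
            # Create the initial hidden state on whatever device the
            # input already lives on; no device argument is needed.
            hidden = torch.zeros(input_tensor.size(0), self.hidden_size,
                                 device=input_tensor.device)
        combined = torch.cat((input_tensor, hidden), dim=1)
        hidden = self.input_to_hidden(combined)
        output = self.input_to_output(combined)
        output = self.softmax(self.fc(output))
        return output, hidden

Usage is then device-agnostic: move the model and the batch with the same .to(device) call and run outputs, hidden = model(images.flatten(1)), flattening CIFAR10's 3x32x32 images into vectors of length 3*32*32 = input_size.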

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1: Ophir Yaniv