Why does inputting a tensor into a neural network fail to produce an output?

I'm new to deep learning and trying to reproduce a neural renderer program.

FCN() is a neural renderer network that renders a 10-dimensional stroke parameter into a stroke on a 128x128 canvas. renderer.pkl contains the network parameters trained by the author. The input has shape batchsize x 10; here I assume batchsize == 1. Five strokes are rendered on the canvas at a time.

I generate the strokes randomly; print(action) shows that action is a non-zero tensor of shape [5, 10].

But I don't know why canvas1 remains in the same state as canvas0 (all zeros), which suggests the renderer is not doing anything at all.

I would appreciate any help.

import cv2
import torch
import numpy as np
import torch.nn as nn
import torch.nn.functional as F
from PIL import Image  # needed for Image.fromarray below

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

class FCN(nn.Module): 
    def __init__(self):
        super(FCN, self).__init__()
        self.fc1 = nn.Linear(10, 512)
        self.fc2 = nn.Linear(512, 1024)
        self.fc3 = nn.Linear(1024, 2048)
        self.fc4 = nn.Linear(2048, 4096)
        self.conv1 = nn.Conv2d(16, 32, 3, 1, 1)
        self.conv2 = nn.Conv2d(32, 32, 3, 1, 1)
        self.conv3 = nn.Conv2d(8, 16, 3, 1, 1)
        self.conv4 = nn.Conv2d(16, 16, 3, 1, 1)
        self.conv5 = nn.Conv2d(4, 8, 3, 1, 1)
        self.conv6 = nn.Conv2d(8, 4, 3, 1, 1)
        self.pixel_shuffle = nn.PixelShuffle(2)

    def forward(self, x): # b x 10
        x = F.relu(self.fc1(x)) #512
        x = F.relu(self.fc2(x)) #1024
        x = F.relu(self.fc3(x)) #2048
        x = F.relu(self.fc4(x)) #4096
        x = x.view(-1, 16, 16, 16)             # reshape to b x 16 x 16 x 16 (C x H x W)
        x = F.relu(self.conv1(x))              # b x 32 x 16 x 16
        x = self.pixel_shuffle(self.conv2(x))  # b x 32 x 16 x 16 -> b x 8 x 32 x 32
        # PixelShuffle: (*, C*r^2, H, W) -> (*, C, H*r, W*r), here r = 2
        x = F.relu(self.conv3(x))              # b x 16 x 32 x 32
        x = self.pixel_shuffle(self.conv4(x))  # b x 16 x 32 x 32 -> b x 4 x 64 x 64
        x = F.relu(self.conv5(x))              # b x 8 x 64 x 64
        x = self.pixel_shuffle(self.conv6(x))  # b x 4 x 64 x 64 -> b x 1 x 128 x 128
        x = torch.sigmoid(x)
        return 1 - x.view(-1, 128, 128)

Decoder = FCN() 
Decoder.load_state_dict(torch.load('renderer.pkl'))

def decode(x, canvas): # b * 10
    x = x.view(-1, 10)
    stroke = 1 - Decoder(x[:, :10]) 
    stroke = stroke.view(-1, 128, 128, 1) #bsz x 128 x 128 x 1
    stroke = stroke.permute(0, 3, 1, 2) # b x1x128x128
    stroke = stroke.view(-1, 5, 1, 128, 128) 
    for i in range(5):
        canvas = canvas * (1 - stroke[:, i])
    return canvas

width = 128 
canvas0 = torch.zeros([1, width, width], dtype=torch.uint8).to(device)
action = []
for i in range(5): # 5 x 10
    f = np.random.uniform(0,1,10)
    action.append(f)
action = torch.tensor(action).float()
action = action.unsqueeze(0)  # 1 x 5 x 10

canvas1 = decode(action, canvas0)  # 1 x 1 x 128 x 128
canvas1 = torch.squeeze(canvas1, dim=0)
canvas1 = torch.squeeze(canvas1, dim=0) # 128 x 128
canvas1 = canvas1.detach().numpy()
Image.fromarray(np.uint8(canvas1))

print(canvas1)


Solution 1:[1]

I suspect the issue is caused by the np.uint8 line. Most neural networks are parameterized so that their outputs fall in the range [0, 1], and yours is no exception (the sigmoid activation's range is [0, 1]). Casting any float in this range to an int truncates it to 0. Try multiplying by 255 first so that the outputs cover a non-trivial range after the integer cast.

...
canvas1 = decode(action, canvas0)  # 1 x 1 x 128 x 128
canvas1 = torch.squeeze(canvas1, dim=0)
canvas1 = torch.squeeze(canvas1, dim=0) # 128 x 128
canvas1 = canvas1.detach().numpy() * 255
Image.fromarray(np.uint8(canvas1))
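
To see why the cast alone blanks the image, here is a minimal, self-contained sketch of the truncation behavior. The small array below is a stand-in for the decoded canvas, not the actual renderer output:

```python
import numpy as np

# Stand-in for a decoded canvas: floats in [0, 1], like sigmoid outputs.
canvas = np.array([[0.0, 0.4], [0.9, 1.0]])

# Casting [0, 1] floats directly to uint8 truncates toward zero,
# so almost every pixel becomes 0 (only the exact 1.0 survives, as 1).
truncated = canvas.astype(np.uint8)
print(truncated)  # values: 0, 0, 0, 1

# Scaling to [0, 255] first preserves the dynamic range.
scaled = (canvas * 255).astype(np.uint8)
print(scaled)     # values: 0, 102, 229, 255
```

A uint8 image whose pixels are all 0 or 1 looks uniformly black when displayed, which is exactly the all-zero canvas1 described in the question.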

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 DerekG