TensorFlow vs PyTorch convolution confusion

I am confused about how to replicate Keras (TensorFlow) convolutions in PyTorch.

In Keras, I can do something like this (the input size is (256, 237, 1, 21) and the output size is (256, 237, 1, 1024)):

import tensorflow as tf
x = tf.random.normal((256,237,1,21))
y = tf.keras.layers.Conv1D(filters=1024, kernel_size=5, padding="same")(x)
print(y.shape) 
(256, 237, 1, 1024)

However, in PyTorch, when I try to do the same thing I get a different output size:

import torch
import torch.nn as nn
x = torch.randn(256, 237, 1, 21)
m = nn.Conv1d(in_channels=237, out_channels=1024, kernel_size=(1,5))
y = m(x)
print(y.shape)
torch.Size([256, 1024, 1, 17])

I want PyTorch to give me the same output size that Keras does.

This previous question seems to imply that Keras filters are PyTorch's out_channels, but that's what I already have. I tried adding padding in PyTorch with padding=(0, 503), but that gives me torch.Size([256, 1024, 1, 1023]), which is still not correct. It also takes much longer to run than Keras does, so I suspect I have assigned a parameter incorrectly.

How can I replicate what Keras did with convolution in PyTorch?



Solution 1:[1]

In TensorFlow, tf.keras.layers.Conv1D takes a tensor of shape (batch_shape + (steps, input_dim)), which means that what is commonly known as the channels dimension appears on the last axis. For instance, in 2D convolution you would have (batch, height, width, channels). This is different from PyTorch, where the channel dimension is right after the batch axis: torch.nn.Conv1d takes shapes of (batch, channel, length). So you will need to permute two axes.
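Concretely, the permutation just swaps the channel and steps axes (the sizes here match the example further below):

>>> x = torch.randn(256, 237, 21)  # TF layout: (batch, steps, channels)
>>> x.permute(0, 2, 1).shape       # PyTorch layout: (batch, channels, steps)
torch.Size([256, 21, 237])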

For torch.nn.Conv1d:

  • in_channels is the number of channels in the input tensor
  • out_channels is the number of filters, i.e. the number of channels the output will have
  • stride is the step size of the convolution
  • padding is the zero-padding added to both sides

In PyTorch (prior to version 1.9) there is no padding='same' option, so you need to choose the padding yourself. With stride=1 and kernel size k, the output length is L + 2*padding - k + 1, so padding must equal kernel_size // 2 (i.e. padding=2 here) in order to maintain the length of the tensor.
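As a quick sanity check that the length is preserved (the channel counts here are just illustrative):

>>> conv = nn.Conv1d(in_channels=21, out_channels=8, kernel_size=5, padding=5 // 2)
>>> conv(torch.randn(1, 21, 237)).shape
torch.Size([1, 8, 237])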


In your example, since x has a shape of (256, 237, 1, 21), in TensorFlow's terminology it will be considered as an input with:

  • a batch shape of (256, 237),
  • steps=1, so the length of your 1D input is 1,
  • 21 input channels.

Whereas PyTorch, given your x of shape (256, 237, 1, 21) and kernel_size=(1, 5), sees:

  • a batch size of 256,
  • 237 input channels (the axis right after the batch axis),
  • a spatial extent of (1, 21), which a (1, 5) kernel shrinks to (1, 17), hence your torch.Size([256, 1024, 1, 17]) (see the sketch after this list).
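Putting the two readings together: to reproduce the Keras output shape on the original 4D tensor, you can fold the two leading axes into a single batch axis, convolve with 21 input channels over the length-1 sequence, and unfold the result. A minimal sketch, assuming padding=2 as the manual equivalent of padding="same" for kernel_size=5:

import torch
import torch.nn as nn

x = torch.randn(256, 237, 1, 21)
m = nn.Conv1d(in_channels=21, out_channels=1024, kernel_size=5, padding=2)
y = m(x.reshape(256 * 237, 1, 21).permute(0, 2, 1))  # (N, 21, 1) -> (N, 1024, 1)
y = y.permute(0, 2, 1).reshape(256, 237, 1, 1024)
print(y.shape)
torch.Size([256, 237, 1, 1024])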

I have kept the input in both examples below (TensorFlow vs. PyTorch) as x.shape = (256, 237, 21), assuming 256 is the batch size, 237 is the length of the input sequence, and 21 is the number of channels (i.e. the input dimension, the dimension at each timestep).

In TensorFlow:

>>> x = tf.random.normal((256, 237, 21))
>>> m = tf.keras.layers.Conv1D(filters=1024, kernel_size=5, padding="same")
>>> y = m(x)
>>> y.shape
TensorShape([256, 237, 1024])

In PyTorch:

>>> x = torch.randn(256, 237, 21)
>>> m = nn.Conv1d(in_channels=21, out_channels=1024, kernel_size=5, padding=2)
>>> y = m(x.permute(0, 2, 1))
>>> y.permute(0, 2, 1).shape
torch.Size([256, 237, 1024])

So in the latter case, you would simply work with x = torch.randn(256, 21, 237)...
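That is, keeping the data channels-first from the start avoids both permutes (same illustrative sizes as above):

>>> x = torch.randn(256, 21, 237)
>>> m = nn.Conv1d(in_channels=21, out_channels=1024, kernel_size=5, padding=2)
>>> m(x).shape
torch.Size([256, 1024, 237])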

Solution 2:[2]

PyTorch now supports 'same' convolutions out of the box; take a look at this link: [Same convolution][1]

import torch.nn as nn

# ConvBlock is the answerer's own helper module (a conv-layer wrapper, not shown here)
class InceptionNet(nn.Module):
    def __init__(self, in_channels, in_1x1, in_3x3reduce, in_3x3, in_5x5reduce, in_5x5, in_1x1pool):
        super(InceptionNet, self).__init__()
        # padding='same' keeps the spatial size unchanged (requires stride=1)
        self.incep_1 = ConvBlock(in_channels, in_1x1, kernel_size=1, padding='same')

Note that a 'same' convolution only supports the default stride value of 1; anything else won't work.

[1]: https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html
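Applied to the original question, the whole Keras call can then be mirrored directly. A minimal sketch, assuming PyTorch >= 1.9 (where string padding was added) and the channels-first layout from Solution 1:

import torch
import torch.nn as nn

x = torch.randn(256, 21, 237)  # (batch, channels, length)
m = nn.Conv1d(in_channels=21, out_channels=1024, kernel_size=5, padding='same')
print(m(x).shape)
torch.Size([256, 1024, 237])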

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution    Source
Solution 1
Solution 2  Allaye