Feature importance in neural networks with multiple differently shaped inputs in PyTorch and Captum (classification)
I have developed a model with three input types: image, categorical data, and numerical data. For the image data I used ResNet50; for the other two I developed my own network.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

class MulticlassClassification(nn.Module):
    def __init__(self, cat_size, num_col, output_size, layers, p=0.4):
        super(MulticlassClassification, self).__init__()

        # IMAGE: frozen pretrained ResNet50 with a new trainable head
        self.cnn = models.resnet50(pretrained=True)
        for param in self.cnn.parameters():
            param.requires_grad = False
        n_inputs = self.cnn.fc.in_features
        self.cnn.fc = nn.Sequential(
            nn.Linear(n_inputs, 250),
            nn.ReLU(),
            nn.Dropout(p),
            nn.Linear(250, output_size),
            nn.LogSoftmax(dim=1)
        )

        # TABULAR: one embedding per categorical column, then an MLP
        self.all_embeddings = nn.ModuleList(
            [nn.Embedding(categories, size) for categories, size in cat_size]
        )
        self.embedding_dropout = nn.Dropout(p)
        self.batch_norm_num = nn.BatchNorm1d(num_col)

        all_layers = []
        num_cat_col = sum(e.embedding_dim for e in self.all_embeddings)
        input_size = num_cat_col + num_col
        for i in layers:
            all_layers.append(nn.Linear(input_size, i))
            all_layers.append(nn.ReLU(inplace=True))
            all_layers.append(nn.BatchNorm1d(i))
            all_layers.append(nn.Dropout(p))
            input_size = i
        all_layers.append(nn.Linear(layers[-1], output_size))
        self.layers = nn.Sequential(*all_layers)

        # COMBINE: fuse tabular and image outputs into the final logits
        self.combine_fc = nn.Linear(output_size * 2, output_size)

    def forward(self, image, x_categorical, x_numerical):
        # tabular branch
        embeddings = []
        for i, embedding in enumerate(self.all_embeddings):
            # debug print: shows what Captum actually feeds this input
            print(x_categorical[:, i])
            embeddings.append(embedding(x_categorical[:, i]))
        x = torch.cat(embeddings, 1)
        x = self.embedding_dropout(x)
        x_numerical = self.batch_norm_num(x_numerical)
        x = torch.cat([x, x_numerical], 1)
        x = self.layers(x)

        # image branch
        x2 = self.cnn(image)

        # combine both branches
        x3 = torch.cat([x, x2], 1)
        x3 = F.relu(self.combine_fc(x3))
        return x3
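For context, this is how I instantiate and call the model. The sizes below (cat_size, num_col, layers, and the batch size) are made-up placeholders purely to show how the three inputs are shaped:

# Hypothetical sizes, only to illustrate the three input shapes.
cat_size = [(10, 5), (7, 4)]   # (num_categories, embedding_dim) per column
model = MulticlassClassification(
    cat_size, num_col=6, output_size=3, layers=[64, 32]
).cuda()

img = torch.randn(8, 3, 224, 224).cuda()        # ResNet50-sized image batch
stack_cat = torch.randint(0, 7, (8, 2)).cuda()  # integer category indices
stack_num = torch.randn(8, 6).cuda()            # numerical features
out = model(img, stack_cat, stack_num)          # -> shape (8, 3)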
Now, after successful training, I would like to compute integrated gradients using the Captum library.
from captum.attr import IntegratedGradients

ig = IntegratedGradients(model)

# take one batch from the test loader:
# image, categorical indices, numerical features, labels
testiter = iter(testloader)
img, stack_cat, stack_num, target = next(testiter)

attributions_ig = ig.attribute(
    inputs=(img.cuda(), stack_cat.cuda(), stack_num.cuda()),
    target=target.cuda(),
)
And here I got an error:
RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.cuda.FloatTensor instead (while checking arguments for embedding)
With the print in my forward method I figured out that Captum injects a wrongly typed tensor into my x_categorical input: the integer category indices arrive as torch.cuda.FloatTensor, which nn.Embedding rejects. It looks like Captum treats every input as continuous and interpolates it between a baseline and the actual value. How can I change this behaviour?
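To illustrate what I mean, here is a rough, hypothetical simplification of the input scaling that integrated gradients performs (this is my mental model, not Captum's actual code):

# Hypothetical sketch of IG's input interpolation, not Captum's actual code.
n_steps = 50
alphas = torch.linspace(0.0, 1.0, steps=n_steps)
baseline = torch.zeros_like(stack_cat, dtype=torch.float32)

# Every interpolation step yields a float tensor, even though stack_cat
# holds integer indices meant for nn.Embedding.
scaled_cat = [baseline + a * (stack_cat.float() - baseline) for a in alphas]
print(scaled_cat[0].dtype)  # torch.float32 -> the RuntimeError above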
I found a similar issue here: https://github.com/pytorch/captum/issues/439. There it was recommended to use an interpretable embedding layer (configure_interpretable_embedding_layer) for the categorical data.
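This is roughly how I wired it up, following the Captum docs. The names all_embeddings.{i} refer to the entries of the ModuleList in my model above, and the wrapping is undone again at the end:

from captum.attr import (
    IntegratedGradients,
    configure_interpretable_embedding_layer,
    remove_interpretable_embedding_layer,
)

# Replace every nn.Embedding in the ModuleList with Captum's wrapper.
# The wrapped layers pass pre-computed embedding vectors straight through.
interpretable_embs = [
    configure_interpretable_embedding_layer(model, f'all_embeddings.{i}')
    for i in range(len(model.all_embeddings))
]

# Turn the integer indices of each categorical column into embeddings.
emb_cols = [
    wrapper.indices_to_embeddings(stack_cat[:, i].cuda())
    for i, wrapper in enumerate(interpretable_embs)
]

ig = IntegratedGradients(model)
attributions_ig = ig.attribute(
    inputs=(img.cuda(), torch.cat(emb_cols, dim=1), stack_num.cuda()),
    target=target.cuda(),
)

# Restore the original embedding layers afterwards.
for wrapper in interpretable_embs:
    remove_interpretable_embedding_layer(model, wrapper)

When I used it, I got this error: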
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
I would be very grateful for any tips and advice on how to combine all three inputs and solve my problem.
Sources
Source: Stack Overflow, licensed under CC BY-SA 3.0.