Why not have TensorFlow Keras compute the loss separately for each sample and then compute the total loss over the batch's samples?

I want to train my model on variable-length inputs. Every sample, both within a batch and across batches, has different dimensions, for example (200, 5, 1), (33, 5, 1), (1000, 5, 1), etc.
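For concreteness, here is a minimal sketch of the kind of model I mean. The layer choices are placeholders, not my actual architecture; the point is that the time dimension is declared as `None`:

```python
import tensorflow as tf

# Minimal sketch: the first (time) dimension is None, so a single sample
# of shape (200, 5, 1), (33, 5, 1), or (1000, 5, 1) all fit the model.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(None, 5, 1)),
    # Flatten each timestep: (timesteps, 5, 1) -> (timesteps, 5)
    tf.keras.layers.TimeDistributed(tf.keras.layers.Flatten()),
    tf.keras.layers.LSTM(32),   # recurrent layers accept variable length
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
```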

As far as I am aware, the inputs must be uniform within a batch. I'm not sure, but it seems that TensorFlow performs the computation on a single tensor holding the entire batch, which requires us to pad every sample in the batch to the same length.
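To illustrate what I mean, the three samples above would all have to be padded to the length of the longest one. `pad_sequences` is one way to do that:

```python
import numpy as np
import tensorflow as tf

# Three dummy samples with different lengths along the first axis.
samples = [np.random.rand(200, 5, 1).astype("float32"),
           np.random.rand(33, 5, 1).astype("float32"),
           np.random.rand(1000, 5, 1).astype("float32")]

# Pad along the first axis to the longest sample, turning the batch
# into one rectangular tensor.
padded = tf.keras.preprocessing.sequence.pad_sequences(
    samples, padding="post", dtype="float32", value=0.0)
print(padded.shape)  # (3, 1000, 5, 1)
```

A `Masking` layer (`mask_value=0.0`) at the front of the model would then tell mask-aware layers to skip the padded steps, but that padding is exactly what I would like to avoid.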

So why not have TensorFlow Keras compute the loss separately for each sample and then combine the per-sample losses into the batch loss? Yes, this disables parallelism and slows down training, but at least we could handle variable-length input without adding any padding values. I'm not sure whether this is even possible. What do you think? A sketch of what I have in mind follows below.
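Roughly, assuming the model from the first sketch plus a standard loss and optimizer (again, just placeholders I haven't verified):

```python
loss_fn = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.Adam()

def train_on_batch(samples, targets):
    """One 'batch' step: forward each variable-length sample on its own,
    average the per-sample losses, then apply the gradients once."""
    with tf.GradientTape() as tape:
        total_loss = 0.0
        for x, y in zip(samples, targets):
            # Add a batch axis of size 1; no padding is needed because
            # each forward pass only ever sees a single sample.
            pred = model(x[None, ...], training=True)
            total_loss += loss_fn(y[None, ...], pred)
        total_loss /= len(samples)
    grads = tape.gradient(total_loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return total_loss
```

This bypasses `model.fit` entirely and processes the batch one sample at a time, which is why I expect it to lose the parallelism mentioned above.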



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow