How to update a single number in a vector output during neural network training?

Let there be a neural network, an MLP for simplicity. It outputs a vector of size n.

The task I'm solving requires the network, for a given input, to change its output vector from [y_1, y_2, ..., y_n] to [z, y_2, ..., y_n], i.e. to update one of its output values from some y to some z while keeping the rest unchanged.

I've tried doing this using gradient descent with the MSE loss between the whole vectors [y_1, y_2, ..., y_n] and [z, y_2, ..., y_n], but unfortunately found that, since the loss on the equal elements is zero, the gradient is driven only by the differing element, which renders the whole-vector trick redundant. Up to implementation scaling (such as averaging over elements), the resulting gradient is the same as if the MSE loss were computed only between y_1 and z and backpropagated, which of course means that the outputs y_2 through y_n change as well. Intuitively, I thought the network would have to "try" to keep those values unchanged, since the MSE would punish any change, but now that I've analyzed the issue properly, that's obviously false.
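Here is a minimal PyTorch sketch that reproduces the observation (the framework, layer sizes, and values are my own choices for illustration; the question itself isn't tied to any of them). It shows that the whole-vector MSE gradient equals the single-element one up to the averaging factor:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy MLP with an n-dimensional output; sizes are arbitrary.
n = 5
model = nn.Sequential(nn.Linear(3, 16), nn.ReLU(), nn.Linear(16, n))
x = torch.randn(1, 3)
z = 2.0  # desired new value for the first output element

# Build the target: the current output with only element 0 replaced by z.
with torch.no_grad():
    target = model(x).clone()
target[0, 0] = z

# Whole-vector MSE: the loss terms for the equal elements are exactly zero,
# so only the differing element contributes to the gradient.
model.zero_grad()
F.mse_loss(model(x), target).backward()  # 'mean' reduction divides by n
g_full = model[2].weight.grad.clone()

# Single-element MSE for comparison.
model.zero_grad()
((model(x)[0, 0] - z) ** 2).backward()
g_single = model[2].weight.grad.clone()

# Identical up to the 1/n averaging factor.
print(torch.allclose(g_full * n, g_single, atol=1e-6))  # True
```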

So, to summarize: I have no idea how to make the network leave the other output values untouched when training on the data only once (a single instance, a single epoch).

My question is: how do I tackle this problem? I realize that with an MLP I could technically implement a custom optimizer that updates only the last-layer weights affecting the single changed output neuron (the rest of the final-layer gradient would be zero; see the sketch below), but how would I then train the rest of the network? Maybe using the MSE loss is a bad idea in the first place? I'm also wondering whether this problem can be solved with gradient descent at all. Does anyone have any ideas?
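For concreteness, here is a rough PyTorch sketch of the last-layer-only idea (again, sizes and the learning rate are arbitrary). It pins y_2 through y_n exactly, because in a linear last layer output i depends only on row i of the weight matrix, but everything before the last layer stays frozen, which is exactly my problem:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

n = 5
model = nn.Sequential(nn.Linear(3, 16), nn.ReLU(), nn.Linear(16, n))
x = torch.randn(1, 3)
z = 2.0  # desired new value for output 0

with torch.no_grad():
    before = model(x).clone()  # remember the original outputs

# Freeze everything before the last layer; the hidden representation
# h(x) is then fixed, and output i = W[i] @ h(x) + b[i] depends only
# on row i of the final layer.
for p in model[:-1].parameters():
    p.requires_grad_(False)

optimizer = torch.optim.SGD(model[-1].parameters(), lr=0.01)

for _ in range(500):
    optimizer.zero_grad()
    loss = (model(x)[0, 0] - z) ** 2  # loss on the changed element only
    loss.backward()
    # With this single-element loss the gradients for rows 1..n-1 are
    # already zero; masking them explicitly just documents the constraint.
    model[-1].weight.grad[1:] = 0.0
    model[-1].bias.grad[1:] = 0.0
    optimizer.step()

with torch.no_grad():
    after = model(x)
print(after[0, 0].item())                        # should be close to z
print(torch.equal(before[0, 1:], after[0, 1:]))  # True: others bit-identical
```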



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
