PyTorch second derivative is zero everywhere

I'm working on a custom loss function that requires me to compute the trace of the Hessian matrix of a neural network with respect to its inputs.

In one dimension, this reduces to the second derivative of the network with respect to its input. Suppose my network is u and my input is x. The problem is completely one-dimensional, so the relationship I'm trying to model is u(x) where u has one input and one output, and x is one-dimensional. However, I'm running with batches, so x "comes in" as a column vector and I'm expecting a column vector as output.

If we label the samples in the batch as x_1, x_2, ..., x_n, I'm thus interested in the following vectors:

[u'(x_1), u'(x_2), ..., u'(x_n)]^T   and   [u''(x_1), u''(x_2), ..., u''(x_n)]^T

In PyTorch, I have tried the following:

import torch
from torch.autograd import grad

x.requires_grad_(True)  # x must track gradients for autograd.grad to work
u = model(x)
d = grad(u, x, grad_outputs=torch.ones_like(u), create_graph=True)[0]
dd = grad(d, x, grad_outputs=torch.ones_like(d), retain_graph=True, create_graph=False)[0]

This works well for u', but u'' comes out as zero everywhere:

(plot: u'' evaluates to zero across the whole input range)
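As a sanity check (my own minimal example, not part of the original setup), the same double-grad pattern on a known elementwise function with non-vanishing curvature, u = x**3, does produce the expected u'' = 6x:

```python
import torch
from torch.autograd import grad

# Stand-in for the network: an elementwise function with known derivatives,
# u = x**3, so u' = 3x**2 and u'' = 6x.
x = torch.tensor([[1.0], [2.0], [3.0]], requires_grad=True)  # batch as a column vector
u = x ** 3

# First derivative: create_graph=True so we can differentiate d itself.
d = grad(u, x, grad_outputs=torch.ones_like(u), create_graph=True)[0]
# Second derivative of the same scalar map, per sample.
dd = grad(d, x, grad_outputs=torch.ones_like(d))[0]

print(d)   # 3x^2 -> [[3.], [12.], [27.]]
print(dd)  # 6x   -> [[6.], [12.], [18.]]
```

So the mechanics above are correct for a smooth u; the zeros must be coming from the network itself rather than from the grad calls.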

I'm also a bit confused about why a vector-Jacobian product is needed here in the first place. Should I view the computation as u mapping from R^n to R^n, where n is the size of my batch?
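My current understanding of the grad_outputs=ones trick (my own sketch, using sin as a stand-in for the network): viewed as a map from R^n to R^n, u has a diagonal Jacobian, since sample i only influences output i, so the vector-Jacobian product with a ones vector collapses each column of the Jacobian and returns exactly the diagonal, i.e. the per-sample derivatives:

```python
import torch
from torch.autograd import grad
from torch.autograd.functional import jacobian

# Elementwise stand-in for the network.
f = torch.sin

x = torch.linspace(0.0, 1.0, 4, requires_grad=True)
u = f(x)

# VJP with a ones vector: computes ones @ J, summing each column of J.
vjp = grad(u, x, grad_outputs=torch.ones_like(u))[0]

# Full Jacobian of the R^4 -> R^4 map: diagonal, since u_i depends only on x_i.
J = jacobian(f, x.detach())

print(torch.allclose(vjp, torch.diagonal(J)))  # True: the VJP recovers the diagonal
```

If that picture is right, grad_outputs=torch.ones_like(u) is safe here precisely because the Jacobian is diagonal; for a network that mixed samples (e.g. with batch norm) the summed columns would no longer equal the per-sample derivatives.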

Any help is appreciated!



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
