Discrepancy between log_prob and manual calculation
I want to define a multivariate normal distribution with mean [1, 1, 1] and a variance-covariance matrix with 0.3 on the diagonal, and then calculate the log-likelihood at the data point [2, 3, 4].
Using torch.distributions
import torch
import torch.distributions as td

input_x = torch.tensor([2.0, 3.0, 4.0])  # data point
loc = torch.ones(3)                      # mean vector [1, 1, 1]
scale = torch.eye(3) * 0.3               # intended covariance: 0.3 on the diagonal
mvn = td.MultivariateNormal(loc=loc, scale_tril=scale)
mvn.log_prob(input_x)
tensor(-76.9227)
From scratch
Using the formula for the log-likelihood of a multivariate normal,

$$\log p(x) = -\log\sqrt{(2\pi)^k \, |\Sigma|} - \frac{1}{2}(x - \mu)^\top \Sigma^{-1} (x - \mu),$$

we obtain:
import numpy as np

# Normalization term: -log( sqrt( (2*pi*0.3)^3 ) )
first_term = (2 * np.pi * 0.3) ** 3
first_term = -np.log(np.sqrt(first_term))
# Quadratic term: -0.5 * (x - mu)^T Sigma^{-1} (x - mu)
x_center = input_x - loc
tmp = torch.matmul(x_center, scale.inverse())
tmp = -1 / 2 * torch.matmul(tmp, x_center)
first_term + tmp
tensor(-24.2842)
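As an independent cross-check of the manual arithmetic, a minimal sketch using SciPy (not part of the original post) gives the same value:

import numpy as np
from scipy.stats import multivariate_normal

# Same setup: mean [1, 1, 1], covariance 0.3 * I, data point [2, 3, 4]
mvn_ref = multivariate_normal(mean=np.ones(3), cov=0.3 * np.eye(3))
mvn_ref.logpdf([2.0, 3.0, 4.0])  # about -24.2842, agreeing with the manual result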
My question is: what is the source of this discrepancy?
Solution 1
You are passing the covariance matrix as scale_tril instead of as covariance_matrix. From the docs of PyTorch's MultivariateNormal:
scale_tril (Tensor) – lower-triangular factor of covariance, with positive-valued diagonal
So, replacing scale_tril with covariance_matrix would yield the same results as your manual attempt.
In [1]: mvn = td.MultivariateNormal(loc = loc, covariance_matrix=scale)
In [2]: mvn.log_prob(input_x)
Out[2]: tensor(-24.2842)
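To see concretely where -76.9227 comes from: scale_tril=L parameterizes the covariance as L @ L.T, so passing 0.3 * I as scale_tril implies an effective covariance of 0.09 * I. Feeding that covariance in directly reproduces the first result (prompt numbering continued here for illustration):

In [5]: mvn = td.MultivariateNormal(loc = loc, covariance_matrix=scale @ scale.T)
In [6]: mvn.log_prob(input_x)
Out[6]: tensor(-76.9227)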
However, according to the docs, it is more efficient to use scale_tril:
...Using scale_tril will be more efficient:
You can compute the lower Cholesky factor using torch.cholesky (torch.linalg.cholesky in newer PyTorch releases):
In [3]: mvn = td.MultivariateNormal(loc = loc, scale_tril=torch.cholesky(scale))
In [4]: mvn.log_prob(input_x)
Out[4]: tensor(-24.2842)
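To get a feel for that efficiency difference, here is a rough micro-benchmark sketch (the dimension 500 and the timeit setup are illustrative assumptions, not from the original answer). With covariance_matrix the constructor has to factorize the matrix itself on every call, while scale_tril lets it skip the decomposition:

import timeit
import torch
import torch.distributions as td

d = 500                                 # illustrative dimension (assumption)
big_loc = torch.zeros(d)
big_cov = torch.eye(d) * 0.3
big_L = torch.linalg.cholesky(big_cov)  # factor computed once, up front

# Constructing from the covariance matrix factorizes it on every call
t_cov = timeit.timeit(lambda: td.MultivariateNormal(big_loc, covariance_matrix=big_cov), number=100)
# Constructing from the precomputed factor avoids the decomposition
t_tril = timeit.timeit(lambda: td.MultivariateNormal(big_loc, scale_tril=big_L), number=100)
print(t_cov, t_tril)                    # scale_tril should be noticeably faster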
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow