How to calculate covariance using numpy by taking only the first row of the result?
I want to use numpy to find which of a large number of vectors is closest to an input vector, using covariance as the measure of closeness. However, numpy.cov by default calculates the covariance of every pair of rows in the matrix, which wastes a lot of computation when the input matrix is large, since I only need the covariance between the input vector and each of the other vectors.
# Desired Result
input vector -------------> alternative vector 1 ----------> result vector
[f64, f64...]      |        [f64, f64...]             |      [f64, ...n]
                   ├------> alternative vector 2 -----┤
                   |        [f64, f64...]             |
                   ├------> alternative vector 3 -----┤
                   |        [f64, f64...]             |
                            ...                      ...
                            n vectors
This is what I am currently doing:
>>> import numpy as np
>>> input_vec = np.random.normal(size=(1,100))
>>> vecs = np.random.normal(size=(10000,100))
>>> covmatrix = np.cov(np.vstack((input_vec, vecs)), bias=True)
>>> covmatrix.shape
(10001, 10001)
>>> result = covmatrix[0]
Is there any way to speed up the calculation? Thanks
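One possible direction, as a minimal sketch rather than a benchmarked answer: the first row of the covariance matrix only needs the input vector paired with each row, so it can be computed by centering every row around its own mean and taking a single matrix-vector product. The helper name `cov_first_row` is hypothetical (it is not part of numpy), and the sketch assumes each row is one variable, as in the default `rowvar=True` behaviour of numpy.cov, with `bias=True` meaning division by N.

```python
import numpy as np


def cov_first_row(input_vec, vecs, bias=True):
    """Hypothetical helper: first row of np.cov(np.vstack((input_vec, vecs))).

    Computes only the covariances that involve input_vec, instead of the
    full (m + 1) x (m + 1) matrix.
    """
    x = np.asarray(input_vec, dtype=float).ravel()   # shape (n,)
    Y = np.asarray(vecs, dtype=float)                # shape (m, n)
    n = x.shape[0]

    # Center every vector around its own mean.
    xc = x - x.mean()
    Yc = Y - Y.mean(axis=1, keepdims=True)

    # bias=True divides by n, bias=False divides by n - 1 (as in np.cov).
    denom = n if bias else n - 1

    var_input = xc @ xc / denom          # covariance of the input with itself
    cov_with_input = Yc @ xc / denom     # covariance of the input with each row

    # Same layout as covmatrix[0]: the input's variance first, then one
    # covariance per alternative vector.
    return np.concatenate(([var_input], cov_with_input))


rng = np.random.default_rng(0)
input_vec = rng.normal(size=(1, 100))
vecs = rng.normal(size=(10000, 100))

result = cov_first_row(input_vec, vecs)
print(result.shape)  # (10001,)
```

This avoids allocating the 10001 x 10001 matrix entirely; the work is one pass to center the rows plus one matrix-vector product. If only the covariances with the alternative vectors are needed, `result[1:]` drops the leading variance term.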
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
