Applying weights to KNN dimensions

When doing a KNN search in ES/OS (Elasticsearch/OpenSearch), it seems to be recommended to normalize the data in the knn vectors to prevent single dimensions from overpowering the final scoring.

In my current example I have a 3-dimensional vector where all values are normalized to the range 0 to 1:

[0.2, 0.3, 0.2]
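For context, the kind of normalization described above could look like the following minimal sketch. It assumes plain per-dimension min-max scaling; the function name `min_max_normalize` and the raw sample values are made up for illustration and are not part of my actual data.

```python
def min_max_normalize(rows):
    """Rescale every column (dimension) of `rows` to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    normalized = []
    for row in rows:
        normalized.append([
            # A constant column has no spread, so map it to 0.0.
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, lows, highs)
        ])
    return normalized


# Hypothetical raw features that live on very different scales.
raw = [
    [10.0, 200.0, 1.0],
    [20.0, 400.0, 3.0],
    [30.0, 300.0, 2.0],
]
scaled = min_max_normalize(raw)
print(scaled)
```

After this step every dimension spans the same range, so no dimension dominates the distance purely because of its units.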

From the perspective of Euclidean-distance-based scoring, this seems to give equal weight to all dimensions.

In my particular example I am using the l2 space type:

"method": {
  "name": "hnsw",
  "space_type": "l2",
  "engine": "nmslib"
}
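To show where that fragment sits, here is a sketch of a complete index body built as a Python dict. It assumes the OpenSearch k-NN plugin's `knn_vector` field type; the field name `my_vector` is hypothetical, and only the `method` block comes from my setup.

```python
import json

# Hypothetical index body; "my_vector" is an illustrative field name.
index_body = {
    "settings": {"index": {"knn": True}},
    "mappings": {
        "properties": {
            "my_vector": {
                "type": "knn_vector",
                "dimension": 3,  # matches the 3-dimensional example vector
                "method": {
                    "name": "hnsw",
                    "space_type": "l2",
                    "engine": "nmslib",
                },
            }
        }
    },
}

# Serialize to the JSON that would be sent when creating the index.
print(json.dumps(index_body, indent=2))
```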

However, if I want to give more weight to one of my dimensions (say by a factor of 2), would it be acceptable to single out that dimension and normalize it to the range 0-2 instead of the base range of 0-1?

Example:

[0.2, 0.3, 1.2] // Third dimension is now between 0 and 2

The distance computation for this term would now be (2 * (xi - yi))^2, i.e. 4 * (xi - yi)^2, which leads to bigger differences compared to the rest. As a result, the overall score would be more sensitive to differences in this particular term.
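The effect of rescaling one dimension can be checked numerically. The sketch below assumes the distance function is the squared L2 distance (the sum of squared per-dimension differences), which matches the term written above; the helper names and sample vectors are made up for illustration. It suggests that scaling a dimension by a factor f is the same as weighting its squared difference by f^2, so a scale of 2 gives a weight of 4, and a weight of 2 would need a scale of sqrt(2).

```python
import math


def squared_l2(x, y):
    """Sum of squared per-dimension differences."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def weighted_squared_l2(x, y, weights):
    """Squared L2 distance with an explicit weight per dimension."""
    return sum(w * (xi - yi) ** 2 for w, xi, yi in zip(weights, x, y))


def scale(vector, factors):
    """Multiply each dimension by its scale factor before indexing."""
    return [f * v for f, v in zip(factors, vector)]


# Illustrative vectors, already normalized to [0, 1].
x = [0.2, 0.3, 0.6]
y = [0.5, 0.1, 0.1]

# Stretching the third dimension to the range 0-2 ...
factors = [1.0, 1.0, 2.0]
d_scaled = squared_l2(scale(x, factors), scale(y, factors))

# ... behaves like a weight of 2^2 = 4 on that dimension.
d_weighted = weighted_squared_l2(x, y, [f ** 2 for f in factors])
assert math.isclose(d_scaled, d_weighted)

# For an effective weight of 2, the scale factor would be sqrt(2).
sqrt2_factors = [1.0, 1.0, math.sqrt(2.0)]
d_weight_two = squared_l2(scale(x, sqrt2_factors), scale(y, sqrt2_factors))
assert math.isclose(d_weight_two, weighted_squared_l2(x, y, [1.0, 1.0, 2.0]))

print(d_scaled, d_weight_two)
```

Note that the query vector has to be scaled with the same factors as the indexed vectors, otherwise the comparison is no longer a weighted distance.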

In OS the score is calculated as 1 / (1 + Distance Function), so the higher the value returned from the distance function, the lower the score will be.
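A small sketch of how that score reacts to the rescaling, again assuming the distance function is the squared L2 distance. The function `knn_score` and the vectors are illustrative only and do not call OpenSearch itself.

```python
def squared_l2(x, y):
    """Sum of squared per-dimension differences."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def knn_score(distance):
    """Score formula from above: 1 / (1 + distance)."""
    return 1.0 / (1.0 + distance)


# Query and document differ only in the third dimension.
query = [0.2, 0.3, 0.2]
doc = [0.2, 0.3, 0.4]
plain_score = knn_score(squared_l2(query, doc))

# Same pair after the third dimension is stretched by a factor of 2.
scaled_query = [0.2, 0.3, 0.4]
scaled_doc = [0.2, 0.3, 0.8]
boosted_score = knn_score(squared_l2(scaled_query, scaled_doc))

# The same difference now costs more, so the score drops.
assert boosted_score < plain_score
print(plain_score, boosted_score)
```

Because the score is bounded between 0 and 1 and falls off as the distance grows, a large scale factor on one dimension quickly pushes every document that differs in that dimension toward a low score.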

Is there a method for deciding what the weighting range should be? Presumably, setting the range too high would make the dimension too dominant?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
