Neo4j - GDS - FastRP Algorithm - Same values but different embeddings
While using the FastRP algorithm, a phrase in the documentation caught my attention. I also faced this situation.
Phrase: Because of L2 normalization which is applied to each iteration (here only one iteration), all nodes have the same embedding despite having different age values (apart from rounding errors).
When computing embeddings with FastRP on a graph (let's consider only the properties, that is, propertyRatio = 1), how can the embeddings of two nodes with exactly the same values be different? The documentation presents this as a normal situation, but it seemed a bit inconvenient to me.
Solution 1:[1]
If there is a single node property value and a propertyRatio of 1.0, then the embeddings are identical. However, as soon as you add more node properties or lower the propertyRatio, the values of node properties come into play.
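The single-property case can be illustrated with a minimal sketch in plain Python. It assumes a simplified model in which a node's initial vector is its property value multiplied by one shared vector, followed by L2 normalization. The names `l2_normalize`, `shared_direction`, and `embed` are hypothetical, and this is not the GDS implementation.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit L2 length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

# Hypothetical stand-in for the random projection of the single property.
shared_direction = [0.5, -1.0, 2.0]

def embed(age):
    # Assumed model: scale the shared direction by the property value, then normalize.
    return l2_normalize([age * d for d in shared_direction])

young = embed(10)
old = embed(100)

# The magnitude of age is divided out, so both vectors match up to rounding.
print(all(math.isclose(a, b) for a, b in zip(young, old)))  # True
```

Under this assumption, any positive age yields the same unit vector, which matches the documentation's remark that all nodes share one embedding apart from rounding errors.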
One thing to note is that node values are normalized node by node, so if you use a propertyRatio of 1 with the following nodes:
(a:Person {age: 10, numberOfPets: 1}), (b:Person {age: 100, numberOfPets: 10})
The embeddings will still be identical. However, a node such as (c:Person {age: 10, numberOfPets: 10}) would have a different embedding.
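The three Person nodes can be checked with a short Python sketch. It assumes, as a simplification, that L2 normalization is applied directly to each node's raw property vector (age, numberOfPets). The real algorithm first projects the properties with a random matrix, but a linear projection keeps proportional vectors proportional, so the conclusion should carry over. The helper names are hypothetical.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit L2 length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def same_direction(u, v):
    """True if two vectors are equal up to floating-point rounding."""
    return all(math.isclose(x, y) for x, y in zip(u, v))

# Property vectors are (age, numberOfPets) for each node.
a = l2_normalize([10, 1])
b = l2_normalize([100, 10])   # exactly 10 times node a
c = l2_normalize([10, 10])    # different ratio between the two properties

print(same_direction(a, b))  # True: only the ratio of the properties survives
print(same_direction(a, c))  # False: the ratio differs, so the direction differs
```

In other words, per-node normalization discards the overall scale of a node's properties and keeps only their relative proportions.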
As far as I understand, the node values are normalized before being used in the FastRP algorithm so that they do not overpower the original FastRP embeddings (the network position encoding).
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Tomaž Bratanič |