Neo4j cosine similarity with condition
I want to calculate cosine similarity in Neo4j for telecom data and write the result as a node property. The difficulty is expressing one specific condition: I need the maximum cosine similarity between a customer of our telecom and the customers of other telecoms.
I am using version 1.8 of the Neo4j GDS library.
Our telecom's phone numbers use the prefixes 707, 700, 747 and 708.
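To make the prefix rule concrete, here is a minimal Cypher sketch that flags each subscriber as ours or not. It assumes `msisdn` is stored as a string whose first three characters are the operator prefix; the `ourPrefixes` and `isOurs` names are illustrative only.

```cypher
// Assumption: msisdn is a string and its first three characters are the operator prefix.
WITH ['707', '700', '747', '708'] AS ourPrefixes
MATCH (a:Abon)
RETURN a.msisdn AS msisdn,
       left(a.msisdn, 3) IN ourPrefixes AS isOurs
ORDER BY msisdn
```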
Let's define a graph:
CREATE (a1:Abon {msisdn:'7071212121'})
CREATE (a2:Abon {msisdn:'7071313131'})
CREATE (a3:Abon {msisdn:'7071414141'})
CREATE (b1:Abon {msisdn:'7011010101'})
CREATE (b2:Abon {msisdn:'7012323232'})
CREATE (a1)-[:con {weight: 0.98}]->(a2)
CREATE (a1)-[:con {weight: 0.98}]->(b1)
CREATE (a1)-[:con {weight: 0.98}]->(b2)
CREATE (b1)-[:con {weight: 0.98}]->(a2)
CREATE (b2)-[:con {weight: 0.98}]->(a3)
Taking a1 as an example, I need to calculate cosine similarity only against customers of other telecoms:
a1 <> b1 = 0.98
a1 <> b2 = 0.76
The maximum similarity, 0.98 here, is then saved as a node property of a1.
I have started writing a query, but I can't get it right:
MATCH (a1:Abon), (a2:Abon)
MATCH (a1)-[conn:con]-(a2)
WITH {item:id(a1), weights: collect(coalesce(conn.weight, gds.util.NaN()))} AS abonData
WITH collect(abonData) AS Abons
WITH Abons,
// [value in Abons WHERE value.msisdn IN ['7072501005', '7072501006'] | value.item ] AS sourceIds
[value in Abons WHERE value.msisdn IN ['7071206336', '7013193795'] | value.item ] AS targetIds
CALL gds.alpha.similarity.cosine.write({
data: Abons,
// sourceIds: sourceIds,
targetIds: targetIds,
topK: 1
})
YIELD item1, item2, similarity
WITH gds.util.asNode(item1) AS from, gds.util.asNode(item2) AS to, similarity
RETURN from.msisdn AS from, to.msisdn AS to, similarity
ORDER BY similarity DESC
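Two things in the attempt above look suspect. The maps collected into `Abons` carry no `msisdn` key, so the `targetIds` filter matches nothing. Also, as far as the GDS 1.x documentation describes it, the `.write` procedure returns summary statistics rather than `item1`, `item2` and `similarity`, which the `.stream` procedure yields. The following is a hedged sketch of one possible approach, not a verified solution. It assumes `msisdn` is a string starting with the operator prefix, that `item1` is the source item, and it uses a hypothetical property name `maxCosine`.

```cypher
// Assumptions (not from the original post): msisdn is a string starting with the
// operator prefix, and maxCosine is a hypothetical property name.
WITH ['707', '700', '747', '708'] AS ourPrefixes
MATCH (a:Abon), (other:Abon)
OPTIONAL MATCH (a)-[c:con]-(other)
// One weight per (a, other) pair, in a fixed order, so all vectors have the same length.
WITH ourPrefixes, a, other, max(c.weight) AS w
ORDER BY id(a), id(other)
WITH ourPrefixes, a, collect(coalesce(w, gds.util.NaN())) AS weights
WITH ourPrefixes, collect({item: id(a), msisdn: a.msisdn, weights: weights}) AS abons
WITH abons,
     [v IN abons WHERE left(v.msisdn, 3) IN ourPrefixes | v.item] AS sourceIds,
     [v IN abons WHERE NOT (left(v.msisdn, 3) IN ourPrefixes) | v.item] AS targetIds
CALL gds.alpha.similarity.cosine.stream({
  data: abons,
  sourceIds: sourceIds,
  targetIds: targetIds
})
YIELD item1, item2, similarity
// Assumes item1 is the source item, as in the GDS documentation examples.
WITH gds.util.asNode(item1) AS ours, max(similarity) AS maxSimilarity
SET ours.maxCosine = maxSimilarity
RETURN ours.msisdn AS msisdn, maxSimilarity
ORDER BY maxSimilarity DESC
```

Taking `max(similarity)` per source node in Cypher avoids relying on how `topK` interacts with `sourceIds` and `targetIds`. Missing connections are encoded as `gds.util.NaN()`, which is the default `skipValue` for this procedure, so the resulting similarities depend on that choice.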
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
