'How do I join on two columns from multiple csv files in neo4j?
This is my first time using neo4j and I have spent past 3.5 hours in figuring out on how to join two csv files.
sample_user contains
Now I simply want for each user_id what were my top 2 products by timestamp(recency of timestamp). For that I am joining sample_user to sample_order on user_id to get the other info sample_order contains. Is that the correct approach?
I have tried this till now but nothing seems to work!
LOAD CSV with headers FROM 'file:///sample_users.csv' AS row with row where row.user_phone_number is not null
merge(n:user_id {Name:row.user_id})
merge(m:user_phone_number{Name:row.user_phone_number})
merge(l:location{Name:row.location})
with *
LOAD CSV with headers FROM 'file:///sample_orders.csv' AS order_row
merge(o:order_id {Name:order_row.order_id})
merge(t:order_ts{Name:order_row.order_ts})
merge(u:user_id{Name:order_row.user_id})
merge(p:product{Name:order_row.product})
merge(u) - [:TO {x:order_row.distance}]->(o)
Different sites offering different ways by using SET and define uniqueness. Please help!
UPDATE!!! I could merge on the same table like
LOAD CSV with headers FROM 'file:///sample_orders.csv' AS order_row with order_row where order_row.product is not null
merge(o:order_id {Name:order_row.order_id})
merge(t:order_ts{Name:order_row.order_ts})
merge(u:user_id{Name:order_row.user_id})
merge(p:product{Name:order_row.product})
merge(u) - [:TO {x:order_row.order_id}]->(o)
I just want to merge this order_row.user_id with other table's user_id too in order to get their first_name, location. Basically merge both the csv files.
Solution 1:[1]
After spending hours: i finally cracked this. Query to merge two csv files on user ID.
//load user nodes
LOAD CSV WITH HEADERS FROM 'file:///sample_users.csv' AS row
MERGE(u:User {user_id:row.user_id, name:row.first_name, location:row.location})
RETURN count(u);
//load order nodes
LOAD CSV WITH HEADERS FROM 'file:///sample_orders.csv' AS row WITH row WHERE row.order_id IS NOT NULL
MERGE(o:Orders {order_id:row.order_id, order_ts:row.order_ts, user_id_2:row.user_id, product:row.product})
RETURN count(o);
//Create relationships
LOAD CSV WITH HEADERS FROM 'file:///sample_users.csv' AS row
MATCH(u:User{user_id:row.user_id})
MATCH(o:Orders{user_id_2:row.user_id})
MERGE(u)-[:HAS_ORDERED]->(o)
RETURN *;
Solution 2:[2]
I found this somewhere,it may help
1.load all nodes in two commands (one command does not really work; first the people described in the first two columns and then the ones from the second two columns):
a) LOAD CSV WITH HEADERS FROM 'file:///FileName.csv' as row MERGE (a:Person {name: row.Person, number: row.Num_pers1}) b) LOAD CSV WITH HEADERS FROM 'file:///FileName.csv' as row MERGE (a:Person {name: row.Person2, number: row.Num_pers2})
Connect the nodes:
a) LOAD CSV WITH HEADERS FROM 'file:///FileName.csv' as row MERGE (a:Person {name: row.Person, number: row.Num_pers1}) b) LOAD CSV WITH HEADERS FROM 'file:///FileName.csv' as row MERGE (a:Person {name: row.Person2, number: row.Num_pers2})
You can then see all the connections from e.g. Marc: MATCH (a {name: "Marc"})--(b) return a,b
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | coder_bg |
| Solution 2 | swapclick |


