Efficient way to create a weighted graph with networkx where the weights are the number of reviewers two products have in common?
I am analyzing Amazon's reviews dataset, which contains customer IDs, their reviews of different products, and the product identifiers.
The data can be represented by:
| Customer | Product | Review | ... |
|---|---|---|---|
| 1 | A | .... | |
| 1 | B | .... | |
| 2 | A | .... | |
| 2 | C | .... | |
I want to create a weighted undirected graph using networkx, where each node is a product and the weight of the edge between two products is the number of distinct customers who reviewed both of them.
The data is huge, so I was wondering whether there is a feasible way to update the edge weights of the network iteratively while going through the data product by product.
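To make the iterative update concrete, here is a minimal sketch, assuming the data can be streamed as `(customer, product)` pairs in any order. It remembers which products each customer has already reviewed and, for every new row, increments the weight of the edge between the new product and each earlier one. The function name `build_product_graph` and the `rows` input are hypothetical names used only for illustration.

```python
from collections import defaultdict

import networkx as nx


def build_product_graph(rows):
    """Build a weighted product graph from (customer, product) pairs."""
    G = nx.Graph()
    # customer -> set of products already seen for that customer
    seen = defaultdict(set)
    for customer, product in rows:
        G.add_node(product)
        if product in seen[customer]:
            # Repeated review of the same product: count the customer once.
            continue
        for other in seen[customer]:
            if G.has_edge(product, other):
                G[product][other]["weight"] += 1
            else:
                G.add_edge(product, other, weight=1)
        seen[customer].add(product)
    return G


# Example data from the table above.
rows = [(1, "A"), (1, "B"), (2, "A"), (2, "C")]
G = build_product_graph(rows)
for u, v, w in G.edges(data="weight"):
    print(u, v, w)
```

The cost per customer grows with the square of the number of products that customer reviewed, so a few very active reviewers can dominate the running time. If the rows happen to be sorted by customer, only the current customer's set needs to be kept in memory.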
Another desirable representation of this graph would be, for the example above,
| | A | B | C |
|---|---|---|---|
| A | 2 | 1 | 1 |
| B | 1 | 1 | 0 |
| C | 1 | 0 | 1 |
EDIT: I mistakenly wrote (A,C)=2 and have replaced it with 1.
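For the matrix representation, a small sketch using only the standard library, again assuming the data is available as `(customer, product)` pairs: group the distinct products per customer, then count every unordered pair of products once per customer. Pairing a product with itself gives the diagonal, which is the number of customers who reviewed that product. The variable names are illustrative only.

```python
from collections import Counter, defaultdict
from itertools import combinations_with_replacement

# Example data from the first table.
rows = [(1, "A"), (1, "B"), (2, "A"), (2, "C")]

# Group the distinct products reviewed by each customer.
products_by_customer = defaultdict(set)
for customer, product in rows:
    products_by_customer[customer].add(product)

# Count each unordered pair once per customer; (p, p) pairs fill the diagonal.
counts = Counter()
for products in products_by_customer.values():
    for p, q in combinations_with_replacement(sorted(products), 2):
        counts[(p, q)] += 1

# Expand the pair counts into a symmetric matrix with sorted labels.
labels = sorted({product for _, product in rows})
matrix = [[counts[(min(p, q), max(p, q))] for q in labels] for p in labels]

for label, row in zip(labels, matrix):
    print(label, row)
```

For the example this reproduces the table above. On a huge dataset a dense matrix of this kind may not fit in memory, so storing only the nonzero pair counts (as `counts` does) is probably the safer representation.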
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
