Adjacency List Creation from a Blog Post
I need to build an adjacency list showing which users are connected through a shared topic thread. My dataset has two columns: User ID and Topic ID. A topic is like a blog post, so many users can post on it. The dataset looks like this:
| User ID | Topic ID |
|---|---|
| 1 | 55 |
| 2 | 55 |
| 1 | 55 |
| 6 | 55 |
From this I need an adjacency list that contains only the pairs of users who posted on the same topic, like this:
| User A | User B |
|---|---|
| 1 | 2 |
| 1 | 6 |
| 2 | 6 |
Any ideas on how to do this in Excel or Python?
Solution 1:[1]
We'll get by with a little help from our friends collections.defaultdict and itertools.combinations:
from collections import defaultdict
from itertools import combinations

by_post_id = defaultdict(set)
data = [
    (1, 55),
    (2, 55),
    (1, 55),
    (6, 55),
    (1, 42),
    (11, 42),
    (8, 42),
]

# Group up people by post ID
for user_id, post_id in data:
    by_post_id[post_id].add(user_id)
# (`by_post_id` will look like {55: {1, 2, 6}, 42: {8, 1, 11}})

# Walk over each post...
for post_id, user_ids in by_post_id.items():
    # ... and generate all pairs of user IDs.
    for combo in combinations(user_ids, 2):
        print(post_id, combo)
This outputs
55 (1, 2)
55 (1, 6)
55 (2, 6)
42 (8, 1)
42 (8, 11)
42 (1, 11)
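Naturally, if you don't care which post a pair came from, just ignore the post_id. Keep in mind that sets are unordered, so the order of users within a pair is not guaranteed, and two users who share several posts are printed once per post.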
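If the goal is exactly the two-column user table from the question, the same grouping idea can be wrapped up so that each pair appears only once, regardless of how many posts the two users share. The following is a minimal sketch rather than part of the original answer; the names `adjacency_pairs` and `users_by_topic` are made up for illustration, and it assumes the rows are available as `(user_id, topic_id)` tuples.

```python
from collections import defaultdict
from itertools import combinations


def adjacency_pairs(rows):
    """Return sorted, de-duplicated user pairs that share at least one topic.

    `rows` is an iterable of (user_id, topic_id) tuples.
    (Hypothetical helper, not from the original answer.)
    """
    users_by_topic = defaultdict(set)
    for user_id, topic_id in rows:
        users_by_topic[topic_id].add(user_id)

    pairs = set()
    for user_ids in users_by_topic.values():
        # Sorting first makes every pair come out as (smaller, larger),
        # so (1, 8) and (8, 1) collapse into a single edge.
        pairs.update(combinations(sorted(user_ids), 2))
    return sorted(pairs)


rows = [(1, 55), (2, 55), (1, 55), (6, 55)]
for user_a, user_b in adjacency_pairs(rows):
    print(user_a, user_b)
```

For the sample data in the question this prints the pairs 1 2, 1 6 and 2 6, one per line. Collecting the pairs in a set is what removes duplicates across topics; if the number of shared topics matters, a `collections.Counter` could be used in its place.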
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | AKX |
