Adjacency List Creation from a Blog Post

I need to build an adjacency list that shows which users are related through a topic thread. My dataset has two columns: User ID and Topic ID. A Topic ID works like a blog post, so many users can post under it. The dataset looks like this:

User ID Topic ID
1 55
2 55
1 55
6 55

From this I need an adjacency list that contains only the users and their relationships, like this:

User User
1 2
1 6
2 6

Any ideas on how to do this in Excel or Python?



Solution 1:[1]

We'll get by with a little help from our friends collections.defaultdict and itertools.combinations:

from collections import defaultdict
from itertools import combinations

by_post_id = defaultdict(set)

data = [
    (1, 55),
    (2, 55),
    (1, 55),
    (6, 55),
    (1, 42),
    (11, 42),
    (8, 42),
]

# Group up people by post ID
for user_id, post_id in data:
    by_post_id[post_id].add(user_id)

# (`by_post_id` will look like {55: {1, 2, 6}, 42: {8, 1, 11}})

# Walk over each post...
for post_id, user_ids in by_post_id.items():
    # ... and generate all pairs of user IDs.
    for combo in combinations(user_ids, 2):
        print(post_id, combo)

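This outputs the following (the order of users within a pair follows set iteration order, which Python does not guarantee):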

55 (1, 2)
55 (1, 6)
55 (2, 6)
42 (8, 1)
42 (8, 11)
42 (1, 11)

Naturally, if you don't need the post ID for each pair, just ignore it.
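
If the goal is the plain user-to-user list from the question, one possible follow-up is to drop the post ID, sort each pair so the smaller ID comes first, and deduplicate pairs that occur under more than one post. The sketch below assumes the same (user_id, post_id) tuples as above; the helper names `build_edges` and `build_adjacency` are made up for illustration and are not part of the original answer.

```python
from collections import defaultdict
from itertools import combinations

# Same sample rows as above: (user_id, post_id)
data = [
    (1, 55), (2, 55), (1, 55), (6, 55),
    (1, 42), (11, 42), (8, 42),
]


def build_edges(rows):
    """Return a sorted list of unique (user_a, user_b) pairs with user_a < user_b."""
    by_post_id = defaultdict(set)
    for user_id, post_id in rows:
        by_post_id[post_id].add(user_id)

    edges = set()
    for user_ids in by_post_id.values():
        # Sorting first makes every pair come out as (smaller, larger),
        # so (1, 8) and (8, 1) cannot both end up in the result.
        edges.update(combinations(sorted(user_ids), 2))
    return sorted(edges)


def build_adjacency(edges):
    """Map each user to the sorted list of users they share a post with."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {user: sorted(others) for user, others in sorted(neighbours.items())}


edges = build_edges(data)
adjacency = build_adjacency(edges)

# Print the edge list in the "User User" layout from the question.
for a, b in edges:
    print(a, b)
```

With only the four rows for post 55, `build_edges` yields the pairs 1 2, 1 6 and 2 6 from the question. `adjacency` holds the same information keyed by user, which may be more convenient if you need to look up all neighbours of one user.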

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 AKX