'transitive closure python tuples
Does anyone know if there's a python builtin for computing transitive closure of tuples?
I have tuples of the form (1,2),(2,3),(3,4) and I'm trying to get (1,2),(2,3),(3,4),(1,3)(2,4)
Thanks.
Solution 1:[1]
There's no builtin for transitive closures.
They're quite simple to implement though.
Here's my take on it:
def transitive_closure(a):
closure = set(a)
while True:
new_relations = set((x,w) for x,y in closure for q,w in closure if q == y)
closure_until_now = closure | new_relations
if closure_until_now == closure:
break
closure = closure_until_now
return closure
call:
transitive_closure([(1,2),(2,3),(3,4)])
result:
set([(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (2, 4)])
call:
transitive_closure([(1,2),(2,1)])
result:
set([(1, 2), (1, 1), (2, 1), (2, 2)])
Solution 2:[2]
Just a quick attempt:
def transitive_closure(elements):
elements = set([(x,y) if x < y else (y,x) for x,y in elements])
relations = {}
for x,y in elements:
if x not in relations:
relations[x] = []
relations[x].append(y)
closure = set()
def build_closure(n):
def f(k):
for y in relations.get(k, []):
closure.add((n, y))
f(y)
f(n)
for k in relations.keys():
build_closure(k)
return closure
Executing it, we'll get
In [3]: transitive_closure([(1,2),(2,3),(3,4)])
Out[3]: set([(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)])
Solution 3:[3]
We can perform the "closure" operation from a given "start node" by repeatedly taking a union of "graph edges" from the current "endpoints" until no new endpoints are found. We need to do this at most (number of nodes - 1) times, since this is the maximum length of a path. (Doing things this way avoids getting stuck in infinite recursion if there is a cycle; it will waste iterations in the general case, but avoids the work of checking whether we are done i.e. that no changes were made in a given iteration.)
from collections import defaultdict
def transitive_closure(elements):
edges = defaultdict(set)
# map from first element of input tuples to "reachable" second elements
for x, y in elements: edges[x].add(y)
for _ in range(len(elements) - 1):
edges = defaultdict(set, (
(k, v.union(*(edges[i] for i in v)))
for (k, v) in edges.items()
))
return set((k, i) for (k, v) in edges.items() for i in v)
(I actually tested it for once ;) )
Solution 4:[4]
Suboptimal, but conceptually simple solution:
def transitive_closure(a):
closure = set()
for x, _ in a:
closure |= set((x, y) for y in dfs(x, a))
return closure
def dfs(x, a):
"""Yields single elements from a in depth-first order, starting from x"""
for y in [y for w, y in a if w == x]:
yield y
for z in dfs(y, a):
yield z
This won't work when there's a cycle in the relation, i.e. a reflexive point.
Solution 5:[5]
Here's one essentially the same as the one from @soulcheck that works on adjacency lists rather than edge lists:
def inplace_transitive_closure(g):
"""g is an adjacency list graph implemented as a dict of sets"""
done = False
while not done:
done = True
for v0, v1s in g.items():
old_len = len(v1s)
for v2s in [g[v1] for v1 in v1s]:
v1s |= v2s
done = done and len(v1s) == old_len
Solution 6:[6]
If you have a lot of tupels (more than 5000), you might want to consider using the scipy code for matrix powers (see also http://www.ics.uci.edu/~irani/w15-6B/BoardNotes/MatrixMultiplication.pdf)
from scipy.sparse import csr_matrix as csr
def get_closure(tups):
index2id = list(set([tup[0] for tup in tups]) | set([tup[1] for tup in tups]));
id2index = {index2id[i]:i for i in xrange(len(index2id))};
tups_re = tups + [(index2id[i],index2id[i],) for i in xrange(len(index2id))]; # Unfortunately you have to make the relation reflexive first - you could also add the diagonal to M
M = csr( ([True for tup in tups_re],([id2index[tup[0]] for tup in tups_re],[id2index[tup[1]] for tup in tups_re])),shape=(len(index2id),len(index2id)),dtype=bool);
M_ = M**n; # n is maximum path length of your relation
temp = M_.nonzero();
#TODO: You might want to remove the added reflexivity tupels again
return [(index2id[temp[0][i]],index2id[temp[1][i]],) for i in xrange(len(temp[0]))];
In the best case, you can choose n wisely if you know a bit about your relation/graph -- that is how long the longest path can be. Otherwise you have to choose M.shape[0], which might blow up in your face.
This detour also has its limits, in particular you should be sure than the closure does not get too large (the connectivity is not too strong), but you would have the same problem in the python implementation.
Solution 7:[7]
You can create a graph from those tuples then use connnected components algorithm from the created graph. Networkx is library that supports connnected components algorithm.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | |
| Solution 3 | Karl Knechtel |
| Solution 4 | |
| Solution 5 | Dave Abrahams |
| Solution 6 | |
| Solution 7 |
