'Detecting loops in tree structures (graphs)

I'm writing a library which is configured using a recursive structure.

For the sake of this discussion I'm calling these graph structures a "tree" since there is a defined "root" node and each node can refer to more than "child". When properly configured no loops should exist. It differs a little from a tree because child node can be used in multiple places.

      A                  A
     / \                / \
    B   C              B   C
   / \ / \            / \   \
  D   E   F          D   E  |
                          \ |
                            F

Both of these are acceptable despite the fact that E and F are used multiple times on multiple layers. Nodes can have multiple parents and multiple children but MUST NEVER be their own ancestor.

However

A
|
B
|
A
|
...

Is not acceptable because of the loop.

If my library was to be given a graph with a cycle in it then bad things would happen to the library so I am looking for a way to sanity check the input. I need to determine if recursing through this structure will terminate or if it will get stuck in an infinite loop. In effect I need to look for cycles in a directed graph.



Solution 1:[1]

While writing the question I've realised that this can be done with a modified version of the Hare and Tortoise algorithm.

The modification does require some additional memory where the original algorithm did not.

The basic modification to the algorithm is:

  • Instead of iterating through a list the hare traverses the tree depth first.
  • The "Hare" maintains a list (eg: linked list) of pointers to graph nodes. It adds a node to this list when it recurses in and pops the node off when it recurses out.
  • When the hare adds a node to the path (list) making it an even number of elements, the tortoise steps one forward.
  • When the hare removes a node from the path (list) making it an odd number the tortoise steps one backward.
  • The hare and tortoise nodes are compared every time the hare recurses in and a loop is found if the two are equal. This causes the algorithm to stop
  • If the algorithm traverses the entire tree then there will be no loops.

I'm posting an untested code example for this in C. I'll update it once its tested.

#define HAS_LOOP 1
#define DOES_NOT_HAVE_LOOP 0

// Tree nodes each have an array of children
struct TreeNode {
   // some value, eg:
   int value;
   // child nodes:
   struct TreeNode * nodes;
   int nodeCount;
};

// These structures are used to form a single linked list on which Hair and Tortoise will be evaluated
struct LoopDetectionNode {
    struct TreeNode * treeNode;
    struct LoopDetectionNode * next;
};

static int hasLoopRecursive(struct LoopDetectionNode * hare, struct LoopDetectionNode * tortoise, int isOdd) {
    struct LoopDetectionNode newHare = {
        .next = NULL;
    };
    hare->next = &newHare;
    if (isOdd) tortoise = tortoise->next;
    isOdd = !isOdd;
    for (int i = 0; i < hare->treeNode->nodeCount; i++) {
        newHare.treeNode = hare->treeNode->nodes[i];
        if (newHare.treeNode == tortoise.treeNode || hasLoopRecursive(&newHare, tortoise->next, isOdd) == HAS_LOOP) return HAS_LOOP;
    }
    return DOES_NOT_HAVE_LOOP;
}

int hasLoop(struct TreeNode * node) {
    struct LoopDetectionNode hare = {
        .next = NULL;
    };
    
    struct LoopDetectionNode tortoise = {
        .next = &hare;
        .treeNode = node;
    };
   
    for (int i = 0; i < node->nodeCount; i++) {
        hare.treeNode = node->nodes[i];
        if (hare.treeNode == node || hasLoopRecursive(hare, tortoise, 0) == HAS_LOOP) return HAS_LOOP;
    }
    
    return DOES_NOT_HAVE_LOOP;
}

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1