Generate a self-documenting flow chart from a call structure in Python
I have a number of small, few-line functions in Python that encode physical relations between quantities. They build on each other, so a script might look like this:
a = f1(x,y)
b = f2(x,a)
c = f3(a,b,z)
with x, y, z some fixed inputs that I know, and c, at the final stage, the desired model parameter.
I would like to automatically create graphs/flowcharts out of such a piece of code, with each node being a function and each edge corresponding to a return value/argument. Both nodes and edges should be augmented with some kind of doc string, of course.
The motivation is basically both actual visualization and error checking, because I will have many of these tiny networks. I am not interested in a call graph, since I only care about a particular set of functions, not all of them.
I guess one way of dealing with this is to write classes for holding all the metadata of each function (and argument?), and to make each function/variable an instance of such a class. What I am unsure about is how I would then extract the data for the graph. Is there a common way of doing this? Is this a good approach at all?
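For concreteness, the kind of wrapper I have in mind would be a small class roughly like the following (the name ModelFunction and its fields are only placeholders for the idea, not code I already have):

from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ModelFunction:
    """Placeholder wrapper keeping a small model function together with its metadata."""
    func: Callable                                    # the few-line physics function
    doc: str = ''                                     # description for the graph node
    inputs: List[str] = field(default_factory=list)   # argument names -> incoming edges
    output: str = ''                                  # name of the returned quantity -> outgoing edge

    def __call__(self, *args, **kwargs):
        # Delegate, so scripts can call the wrapper exactly like the plain function.
        return self.func(*args, **kwargs)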
Solution 1:[1]
Let's assume your functions are defined in a separate library
todoc/library.py
def f1(x, y):
    """
    f1 is an example concatter

    :param x: Foo (string)
    :param y: Bar (string)

    :return: FooBar (string)
    """
    return x + y


def f2(x, a):
    """
    f2 is an example multiplier

    :param x: Foo (string)
    :param a: Baz (int)
    :return: Foo * Baz
    """
    return x * a
and one of your scripts to analyze/document would be
scriptA.py
from todoc.library import f1, f2

x = 'FOO'
y = 'BAR'
z = 3

a = f1(x, y)
b = f2(a, z)
print(b)
Now you can use the following script to analyze scriptA.py:
analyze_for_doc.py
#!/usr/bin/env python3
import argparse
import ast
from importlib import import_module
from pathlib import Path


class PythonAnalyzer(ast.NodeVisitor):  # Parse python source
    def __init__(self, tree, all_=False, watch=None, recurse=False):
        self._tree = tree
        self._all = all_          # collect every node, not only watched ones
        self._recurse = recurse   # also collect the children of watched nodes
        self._watch = watch       # node class name(s) to collect, e.g. ('Call', 'ImportFrom')
        self._stack = []

    def run(self):
        self.visit(self._tree)
        return self._stack

    def generic_visit(self, node):
        ncn = node.__class__.__name__
        if (
            (isinstance(self._watch, str) and ncn == self._watch) or
            (isinstance(self._watch, (list, tuple)) and ncn in self._watch)
        ):
            self._stack.append(node)
            if self._recurse:
                self._all = True
                super().generic_visit(node)
                self._all = False
        else:
            if self._all:
                self._stack.append(node)
            super().generic_visit(node)

    def show(self, verbose=False):
        print(f'{self.__class__.__name__:<40s} [{len(self._stack):4d}]')
        for i, node in enumerate(self._stack):
            if verbose:
                print(f'{i:4d} {node.__class__.__name__:<30s} '
                      f'{id(node)} {node} {node.__dict__}')
            else:
                print(f'{i:4d} {node.__class__.__name__:<30s} '
                      f'{id(node):<12x} {node}')


def main(opts):
    content = opts.file.open().read()
    tree = ast.parse(content)

    if opts.debug:
        # Dump the complete node list of the parsed tree for inspection.
        pa = PythonAnalyzer(tree, all_=True)
        pa.run()
        pa.show(verbose=opts.verbose)

    # Collect only the function calls and the "from ... import ..." statements.
    pa = PythonAnalyzer(tree, watch=('Call', 'ImportFrom'))
    stack = pa.run()

    print(f'Filename: {opts.file}', '=' * 70, sep='\n')

    # Keep only the imports that come from the library we want to document.
    modules = [m
               for m in stack
               if (isinstance(m, ast.ImportFrom)
                   and m.module.startswith('todoc.'))]

    fun_to_document = []
    for module in modules:
        print(f' Module: {module.module}')
        funs = module.names
        mod = import_module(module.module)
        for fun in funs:
            print(f' Fun: {fun.name}')
            fun_obj = getattr(mod, fun.name)
            if doc := getattr(fun_obj, '__doc__'):
                for line in doc.splitlines():
                    print(f' |{line}')
            fun_to_document.append(fun.name)
        print('')

    for call_ in stack:
        if isinstance(call_, ast.Call):
            if not isinstance(call_.func, ast.Name):
                # Skip attribute calls such as obj.method(); only plain names
                # can refer to the imported library functions.
                continue
            if call_.func.id not in fun_to_document:
                continue
            print(f'Calling {call_.func.id} in line {call_.lineno} '
                  f'with args={call_.args} kwargs={call_.keywords}')


if __name__ == '__main__':
    parser = argparse.ArgumentParser('analyze python for doc')
    parser.add_argument('file', type=Path)
    parser.add_argument('--debug', action='store_true')
    parser.add_argument('--verbose', action='store_true')
    opts = parser.parse_args()
    main(opts)
Running analyze_for_doc.py scriptA.py will output:
Filename: scriptA.py
======================================================================
Module: todoc.library
Fun: f1
|
| f1 is an example concatter
|
| :param x: Foo (string)
| :param y: Bar (string)
|
| :return: FooBar (string)
|
Fun: f2
|
| f2 is an example multiplier
|
| :param x: Foo (string)
| :param a: Baz (int)
| :return: Foo * Baz
|
Calling f1 in line 7 with args=[<ast.Name object at 0x102a589d0>, <ast.Name object at 0x102ac9460>] kwargs=[]
Calling f2 in line 8 with args=[<ast.Name object at 0x102b28850>, <ast.Name object at 0x102b28820>] kwargs=[]
This should give you a starting point for analyzing your Python scripts and extracting the information needed to create the documentation.
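To go from this per-call information to the flow chart the question asks for, one option is to feed the extracted calls into Graphviz. The following is only a rough sketch, assuming the third-party graphviz Python package (plus the Graphviz binaries) is installed; the function make_flowchart, its parameters, and the output file name are illustrative and not part of the solution above. It draws one node per documented function and one labeled edge per variable that is returned by one call and consumed by another:

import ast
from pathlib import Path
from graphviz import Digraph


def make_flowchart(source, fun_names, outfile='flow'):
    """Sketch: one node per documented function, one edge per variable that
    is returned by one call and passed as an argument to another."""
    tree = ast.parse(source)
    dot = Digraph(comment='call structure')
    produced_by = {}                        # variable name -> function that returned it
    for node in tree.body:                  # module-level statements, in source order
        # Only assignments of the form "result = f(arg1, arg2, ...)".
        if (isinstance(node, ast.Assign)
                and isinstance(node.value, ast.Call)
                and isinstance(node.value.func, ast.Name)
                and node.value.func.id in fun_names):
            fun = node.value.func.id
            dot.node(fun, label=fun)        # the label could also include the docstring
            for arg in node.value.args:
                if isinstance(arg, ast.Name) and arg.id in produced_by:
                    # Edge from the producer of this argument, labeled with it.
                    dot.edge(produced_by[arg.id], fun, label=arg.id)
            for target in node.targets:
                if isinstance(target, ast.Name):
                    produced_by[target.id] = fun
    dot.render(outfile, format='png', cleanup=True)


# Usage, e.g.:
# make_flowchart(Path('scriptA.py').read_text(), {'f1', 'f2'}, outfile='scriptA_flow')

For scriptA.py this would yield just the edge f1 -> f2 labeled 'a'; the node labels could additionally carry the first line of each function's __doc__, obtained via import_module and getattr exactly as in the analyzer above.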
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | dgw |
