Is there any tool to create dependency graphs for GitHub private repositories mainly written in Python, showing only the private dependencies?

I'm working in a company that has a couple of private repositories - all of them mainly written in Python.
Each repo has at its root a requirements.txt file.

We are interested in listing the dependencies and dependents for each repository, not the public ones but just our private dependencies and dependents.

Assume for example the requirements.txt file for our_private_package1 looks like this:

public_package1
public_package2
public_package3
our_private_package2
our_private_package3

Also assume there is another package our_private_package4 which has the following requirements.txt file:

public_package2
public_package4
our_private_package1

So we would like to build a dependency graph in which we could see the following for our_private_package1:
Dependencies:

  • our_private_package2
  • our_private_package3

Dependents:

  • our_private_package4

Ideally, we would like to visualize the dependencies with a graph.

2 notes:

  • GitHub's solution is not suitable here for 2 reasons. First, it only shows public dependencies. Also it does not determine the dependents of private repositories (whether the dependents are public or private).
  • Dependencies should not be transitive. With the example above, our_private_package2 should not be listed as a dependency of our_private_package4 (although we should see in a visual representation a path from our_private_package2 to our_private_package4, with a directed branch from our_private_package2 to our_private_package1 and another branch from our_private_package1 to our_private_package4).

If there is no available tool for this kind of problem, I would at least like some guidance on how to do it with some Python code (I'm almost sure it's feasible).



Solution 1:[1]

As mentioned in the comments, I had a similar need. I solved it with an approach like this one:

from collections import defaultdict
from typing import List, Tuple, Dict


def main():
    requirements_file_content_per_package = {
        "our_private_package1": """\
public_package1
public_package2
public_package3
our_private_package2
our_private_package3
        """,

        "our_private_package4": """\
public_package2
public_package4
our_private_package1
        """
    }

    private_packages_names = ["our_private_package1",
                              "our_private_package2",
                              "our_private_package3",
                              "our_private_package4"]

    # process descending dependency
    private_packages_dependencies: Dict[str, Tuple[List[str], List[str]]] = {}
    for private_package_name in private_packages_names:
        requirements_file_content = requirements_file_content_per_package.get(private_package_name, "")
        private_packages_dependencies[private_package_name] = classify_dependencies(requirements_file_content, private_packages_names)

    # display
    for private_package_name, (package_private_dependencies_names, _) in private_packages_dependencies.items():
        print(f"package {private_package_name!r} depends on private {package_private_dependencies_names!r}")

    # process ascending dependencies
    private_packages_dependents: Dict[str, List[str]] = defaultdict(list)
    for private_package_name, (package_private_dependencies_names, _) in private_packages_dependencies.items():
        for package_private_dependency_name in package_private_dependencies_names:
            private_packages_dependents[package_private_dependency_name].append(private_package_name)

    # display
    for private_package_name, package_private_dependents_names in private_packages_dependents.items():
        print(f"package {private_package_name!r} is depended on by private {package_private_dependents_names!r}")


def classify_dependencies(requirements_file_content: str, private_packages_names: List[str]) -> Tuple[List[str], List[str]]:
    private_dependencies: List[str] = []
    public_dependencies: List[str] = []
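    for line in requirements_file_content.splitlines():
        dependency_name = line.strip()
        if not dependency_name:
            continue  # skip blank lines, e.g. the indented last line of a triple-quoted string
        is_private = dependency_name in private_packages_names
        if is_private:
            private_dependencies.append(dependency_name)
        else:
            public_dependencies.append(dependency_name)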
    return private_dependencies, public_dependencies


main()

It's a bit crude, and I am using very basic Python (strings, lists and dicts), but a directed colored graph could be used instead. Also, the names got very long because I wanted to be descriptive; a concise, well-defined vocabulary would probably be better.
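The code above compares whole lines against the private package names, so a pinned entry such as our_private_package2==1.2.0 would be classified as public. Below is a minimal sketch of one way to handle that, assuming the repositories are already cloned side by side under one directory. The helper names (normalize, requirement_name, load_private_dependencies) and the <repos_root>/<package_name>/requirements.txt layout are hypothetical, and the parser is deliberately simplified: it skips option lines and bare URLs instead of resolving them.

```python
import re
from pathlib import Path
from typing import Dict, Optional, Set

# A bare URL requirement such as "git+https://host/repo.git" (skipped below).
_URL_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*://")
# A project name: letters, digits, '.', '_' and '-', alphanumeric at both ends.
_NAME_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9._-]*[A-Za-z0-9])?")


def normalize(name: str) -> str:
    """Lowercase the name and collapse runs of '-', '_' and '.' into one '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(line: str) -> Optional[str]:
    """Return the normalized project name of one requirements.txt line, or None."""
    line = line.strip()
    if _URL_RE.match(line):
        return None  # assumption: bare URLs are ignored in this sketch
    line = line.split("#", 1)[0].strip()  # drop comments
    if not line or line.startswith("-"):
        return None  # blank line, or an option such as "-r other.txt"
    match = _NAME_RE.match(line)
    return normalize(match.group(0)) if match else None


def load_private_dependencies(repos_root: Path, private_names: Set[str]) -> Dict[str, Set[str]]:
    """Read <repos_root>/<name>/requirements.txt for each private package.

    The directory layout is an assumption; a missing file counts as "no dependencies".
    """
    wanted = {normalize(name): name for name in private_names}
    result: Dict[str, Set[str]] = {}
    for name in sorted(private_names):
        req_file = repos_root / name / "requirements.txt"
        text = req_file.read_text(encoding="utf-8") if req_file.is_file() else ""
        found = {requirement_name(line) for line in text.splitlines()}
        result[name] = {wanted[n] for n in found if n in wanted}
    return result
```

The returned dictionary maps each private package to its direct private dependencies only, so it can replace the hard-coded requirements_file_content_per_package step.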

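As for visualization (this snippet goes at the end of main(), where the variables above are in scope):

    import graphviz  # installed with pip; rendering also needs the Graphviz "dot" executable on the PATH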

    dot = graphviz.Digraph()
    # at first, create all the nodes
    for private_package_name in private_packages_names:
        dot.node(private_package_name)
    # then create the edges
    for private_package_name, package_private_dependents_names in private_packages_dependents.items():
        for package_private_dependent_name in package_private_dependents_names:
            dot.edge(package_private_dependent_name, private_package_name)
    dot.render("deps.gv", format="png")

This produced deps.gv.png:
[Image: rendering of the dependencies using Dot (Graphviz)]
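The "directed colored graph" idea mentioned above can also be sketched without the graphviz package, by keeping the direct edges in a dictionary of sets and writing the DOT text by hand. This is only an illustration under stated assumptions: the function names (invert, to_dot), the EXAMPLE data (taken from the question) and the fill colors are my own choices. Edges point from a package to its direct dependency, the same direction as in the snippet above.

```python
from typing import Dict, Set

# Direct dependencies from the question's example (public and private mixed).
EXAMPLE: Dict[str, Set[str]] = {
    "our_private_package1": {
        "public_package1", "public_package2", "public_package3",
        "our_private_package2", "our_private_package3",
    },
    "our_private_package4": {"public_package2", "public_package4", "our_private_package1"},
}
PRIVATE: Set[str] = {
    "our_private_package1", "our_private_package2",
    "our_private_package3", "our_private_package4",
}


def invert(dependencies: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each package to the packages that depend on it directly (not transitively)."""
    dependents: Dict[str, Set[str]] = {name: set() for name in dependencies}
    for package, deps in dependencies.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(package)
    return dependents


def to_dot(dependencies: Dict[str, Set[str]], private_names: Set[str]) -> str:
    """Build DOT source with private nodes filled in one color and public nodes in another."""
    nodes = set(dependencies) | {dep for deps in dependencies.values() for dep in deps}
    lines = ["digraph deps {"]
    for node in sorted(nodes):
        color = "lightblue" if node in private_names else "lightgrey"
        lines.append(f'    "{node}" [style=filled, fillcolor={color}];')
    for package in sorted(dependencies):
        for dep in sorted(dependencies[package]):
            lines.append(f'    "{package}" -> "{dep}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_dot(EXAMPLE, PRIVATE))
```

The printed text can be saved to a file and rendered with the dot command-line tool. To show only private packages, filter each set of dependencies against PRIVATE before calling to_dot.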

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Lenormju