Incremental metrics in PyDeequ

In the AWS Deequ framework for Scala, there is a very nice example of stateful computation, which makes Deequ an interesting candidate for our data engineering pipelines: https://github.com/awslabs/deequ/blob/master/src/main/scala/com/amazon/deequ/examples/algebraic_states_example.md

However, our complete pipeline is currently in Python, and I could not find this feature in PyDeequ.

The closest thing I found is this Stack Overflow post, which in my view misses the point: PyDeequ - incremental metrics collection. The problem, as I see it, is not appending to the report, but persisting the internal state of the checks, so that I don't have to scan the complete dataset again to get a metric such as a mean.
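To make concrete what I mean by "internal state", here is a minimal plain-Python sketch of the idea for a mean. It is not PyDeequ or Deequ API; `MeanState` and its methods are hypothetical names I made up for illustration. The assumption is that the state for a mean is just a running sum and a row count, which can be merged across batches without touching the old data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeanState:
    """Hypothetical mergeable state for a mean: a running sum and a row count."""
    total: float = 0.0
    count: int = 0

    @classmethod
    def from_values(cls, values):
        # Scan one batch of values exactly once and summarize it.
        values = list(values)
        return cls(total=float(sum(values)), count=len(values))

    def merge(self, other):
        # Combining two states needs only the stored numbers, not the raw rows.
        return MeanState(self.total + other.total, self.count + other.count)

    def mean(self):
        # Return None for an empty state instead of dividing by zero.
        return self.total / self.count if self.count else None


# Batch 1: compute the state once; it could be persisted as two numbers.
state_batch1 = MeanState.from_values([1.0, 2.0, 3.0])

# Batch 2: only the new rows are scanned.
state_batch2 = MeanState.from_values([4.0, 5.0])

# The metric over both batches comes from the merged state alone.
combined = state_batch1.merge(state_batch2)
print(combined.mean())
```

What I am looking for is the equivalent of persisting and merging such states through PyDeequ itself, rather than hand-rolling this for every metric.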

Is there a simple way to reproduce something similar to the linked example in Python?

Thank you very much in advance!



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
