'Oracle Cube Replica in Python Pandas

Oracle has a cube function for groupby clause which takes 2 or more columns and groups the results set in all possible combinations of columns passed to the cube function.

SELECT Time, Region, Department, 
   sum(Profit) AS Profit FROM sales
   GROUP BY CUBE  (Time, Region, Dept)

https://docs.oracle.com/cd/F49540_01/DOC/server.815/a68003/rollup_c.htm#32311

Is there a similar function in python pandas or any other python framework which takes in list of pandas columns and produce a new data frame which has results of all possible group by.

I am coding my own version of cube in python. Good to know if already exists one.



Solution 1:[1]

In Pandas you should automatize a function that concatenates horizontally the results of the following groupby

pd.concat([
     df.groupby("Time")['Profit].sum().reset_index(),
     df.groupby("Profit")['Profit].sum().reset_index(),
     df.groupby("Region")['Profit].sum().reset_index(),
     df.groupby(["Time", "Region"])['Profit].sum().reset_index(),
     df.groupby(["Time", "Dept"])['Profit].sum().reset_index(),
     df.groupby(["Region", "Dept"])['Profit].sum().reset_index(),
     df.groupby(["Time", "Region", "Dept"])['Profit].sum().reset_index(),
    ],
    axis=0
)

To do this I suggest using itertools.combinations

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 linello