'Nested Group By (a group by within a group by)

I'm just wondering is it possible that I can use python to get a group by within another group by? Let's say I have a list of tuple:

[
('Simon', 'Brown', 'Playing computer games', 'USA', 'New York City', '2022-01-02', 'Win'),
('Simon', 'Brown', 'Playing computer games', 'USA', 'San Francisco City', '2022-01-03', 'Lose'),
('Simon', 'Brown', 'Playing piano games', 'China', 'Beijing City', '2022-01-02', 'Lose'),
('Simon', 'Brown', 'Playing piano games', 'China', 'Shanghai City', '2022-01-03', 'Lose'),
('Simon', 'Brown', 'Playing piano games', 'Australia', 'Sydney City', '2022-01-03', 'Lose'),
('Trudy', 'Porter', 'Playing piano games', 'Australia', 'Sydney City', '2022-01-03', 'Lose')
]

Is it possible to get the results like the following (it's more like firstly group by based on the first and second item in each tuple, and then within each group, further group by based on the third and fourth items, and then list the rest of the items of each tuple in each line-- with space):

Simon Brown
Playing computer games-USA
New York City            2022-01-02              Win
San Francisco City       2022-01-03              Lose

Playing piano games  China
Beijing City             2022-01-02              Lose
Shanghai City            2022-01-03              Lose

Playing piano games Australia
Sydney City             2022-01-03               Lose

Trudy Porter 
Playing piano games Australia
Sydney City             2022-01-03                Lose

enter image description here



Solution 1:[1]

There's a groupby tool in the standard library, which you could use in combination with itemgetter as key-function. You'd get something like this:

from itertools import groupby
from operator import itemgetter

data = [<your list>]

for key1, group1 in groupby(data, key=itemgetter(0, 1)):
    print(*key1)
    for key2, group2 in groupby(group1, key=itemgetter(2, 3)):
        print("\t", *key2)
        for items in group2:
            print("\t\t", *items[4:])

This prints:

Simon Brown
     Playing computer games USA
         New York City 2022-01-02 Win
         San Francisco City 2022-01-03 Lose
     Playing piano games China
         Beijing City 2022-01-02 Lose
         Shanghai City 2022-01-03 Lose
     Playing piano games Australia
         Sydney City 2022-01-03 Lose
Trudy Porter
     Playing piano games Australia
         Sydney City 2022-01-03 Lose

If you want to format datetime objects (here at index 5) then you could adjust that to

from datetime import datetime
...
        ...
        for items in group2:
            print("\t\t", *(
                    item.strftime("%Y-%m-%d") if isinstance(item, datetime)
                    else item for item in items[4:]
                )
            )

Edit: Adjusted according to @KellyBundy's hint.

Solution 2:[2]

This should work as long as you don't have any outcomes with the same player, date, location and activity:

import pandas as pd

d = [<your list>]
c = ['First', 'Last', 'Activity', 'Country', 'City', 'Date', 'Outcome']
df = pd.DataFrame(data = d, columns = c)
df.pivot(index = c[:5], columns = c[-2]).stack()

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Professor Pantsless