'pandas groupby comparing string value with previous row value and spot changes in new columns

I have this demo df:

info = {'customer': ['Jason', 'Jason', 'Jason', 'Jason',
                     'Molly', 'Molly', 'Molly', 'Molly'], 
'Good': ['Cookie', 'Cookie', 'Cookie', 'Cookie','Ice Cream', 
         'Ice Cream', 'Ice Cream', 'Ice Cream'],
'Date' :['2021-12-14','2022-01-04','2022-01-11','2022-01-18',
         '2022-01-12','2022-01-15','2022-01-19','2022-01-30'],
'Flavor' :['Chocolate','Vanilla','Vanilla','Strawberry',
           'Chocolate', 'Vanilla', 'Caramel', 'Caramel']}
df = pd.DataFrame(data=info)
df

gives:

   customer   Good      Date        Flavor
0   Jason   Cookie      2021-12-14  Chocolate
1   Jason   Cookie      2022-01-04  Vanilla
2   Jason   Cookie      2022-01-11  Vanilla
3   Jason   Cookie      2022-01-18  Strawberry
4   Molly   Ice Cream   2022-01-12  Chocolate
5   Molly   Ice Cream   2022-01-15  Vanilla
6   Molly   Ice Cream   2022-01-19  Caramel
7   Molly   Ice Cream   2022-01-30  Caramel

I am trying to track the change in flavor per customer per Goods in new columns From - To. I did the grouping part:

   df.sort_values(['Date']).groupby(['customer','Good','Date'])['Flavor'].sum()

I got:

 customer  Good       Date      
    Jason     Cookie     2021-12-14     Chocolate
                         2022-01-04       Vanilla
                         2022-01-11       Vanilla
                         2022-01-18    Strawberry
    Molly     Ice Cream  2022-01-12     Chocolate
                         2022-01-15       Vanilla
                         2022-01-19       Caramel
                         2022-01-30       Caramel
    Name: Flavor, dtype: object

The first row per group is the entry point and then I would like to compare the next change per group, if it is different then we track the change in new columns (from & to) and if similar values nothing happens.

I have tried multiple methods & codes but unfortunately I do not know the best way to do so.

Expected output considering reset_index():

  customer   Good      Date        Flavor           From         To
0   Jason   Cookie      2021-12-14  Chocolate    
1   Jason   Cookie      2022-01-04  Vanilla         Chocolate    Vanilla
2   Jason   Cookie      2022-01-11  Vanilla
3   Jason   Cookie      2022-01-18  Strawberry      Vanilla      Strawberry
4   Molly   Ice Cream   2022-01-12  Chocolate
5   Molly   Ice Cream   2022-01-15  Vanilla         Chocolate    Vanilla
6   Molly   Ice Cream   2022-01-19  Caramel         Vanilla      Caramel
7   Molly   Ice Cream   2022-01-30  Caramel


Solution 1:[1]

s=df.assign(
             
             
             From = df.sort_values(by='Date').groupby(['customer',  'Good'])['Flavor'].apply(lambda x: x.shift(1)),
             To = df['Flavor']
).dropna()

out = df.join(s[s['From'] != s['To']].iloc[:,-2:]).fillna('')




   customer       Good        Date      Flavor       From          To
0    Jason     Cookie  2021-12-14   Chocolate                       
1    Jason     Cookie  2022-01-04     Vanilla  Chocolate     Vanilla
2    Jason     Cookie  2022-01-11     Vanilla                       
3    Jason     Cookie  2022-01-18  Strawberry    Vanilla  Strawberry
4    Molly  Ice Cream  2022-01-12   Chocolate                       
5    Molly  Ice Cream  2022-01-15     Vanilla  Chocolate     Vanilla
6    Molly  Ice Cream  2022-01-19     Caramel    Vanilla     Caramel
7    Molly  Ice Cream  2022-01-30     Caramel                       

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1