'how do i insert row value into column by group

I have a dataframe containing 4000 tickers, resembling:

ticker    close    date_a       date_b      
AMZN      3350     2022-01-03   2022-01-03
AMZN      3400     2022-01-03   2022-01-04
AMZN      3450     2022-01-03   2022-01-05
AMZN      3425     2022-01-03   2022-01-06
NVDA      260      2022-01-05   2022-01-03
NVDA      250      2022-01-05   2022-01-04
NVDA      255      2022-01-05   2022-01-05
NVDA      275      2022-01-05   2022-01-06
AAPL      180      2022-01-04   2022-01-03
AAPL      175      2022-01-04   2022-01-04
AAPL      170      2022-01-04   2022-01-05
AAPL      185      2022-01-04   2022-01-06

I am trying to fill a new column change with values from a specific date_a to compare to date_b with the following code:

df['change'] = (df['date_b']['close'] - df['date_a']) / df['date_a'] * 100

I haven't the faintest idea on how to attempt to do this.

I tried the following code:

df['change'] = df.groupby(['ticker','date_a'])['close'] 

But was getting the error length of values (3) does not match length of index (12)

Expected outcome:

ticker    close    date_a       date_b       date_a_close
AMZN      3350     2022-01-03   2022-01-03   3350
AMZN      3400     2022-01-03   2022-01-04   3350
AMZN      3450     2022-01-03   2022-01-05   3350
AMZN      3425     2022-01-03   2022-01-06   3350
NVDA      260      2022-01-05   2022-01-03   255
NVDA      250      2022-01-05   2022-01-04   255
NVDA      255      2022-01-05   2022-01-05   255
NVDA      275      2022-01-05   2022-01-06   255
AAPL      180      2022-01-04   2022-01-03   175
AAPL      175      2022-01-04   2022-01-04   175
AAPL      170      2022-01-04   2022-01-05   175
AAPL      185      2022-01-04   2022-01-06   175

any advice or help with this would be greatly appreciated.



Solution 1:[1]

If need assign absolute minimum value of close for difference between dates per groups use:

df['change'] = (df['date_b'] - df['date_a']).dt.days.abs()

df['change'] = (df.set_index('close')
                  .groupby(['ticker','date_a'])['change']
                  .transform('idxmin')
                  .to_numpy())
print (df)
   ticker  close     date_a     date_b  change
0    AMZN   3350 2022-01-03 2022-01-03    3350
1    AMZN   3400 2022-01-03 2022-01-04    3350
2    AMZN   3450 2022-01-03 2022-01-05    3350
3    AMZN   3425 2022-01-03 2022-01-06    3350
4    NVDA    260 2022-01-05 2022-01-03     255
5    NVDA    250 2022-01-05 2022-01-04     255
6    NVDA    255 2022-01-05 2022-01-05     255
7    NVDA    275 2022-01-05 2022-01-06     255
8    AAPL    180 2022-01-04 2022-01-03     175
9    AAPL    175 2022-01-04 2022-01-04     175
10   AAPL    170 2022-01-04 2022-01-05     175
11   AAPL    185 2022-01-04 2022-01-06     175

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 jezrael