'Pandas group by one column and repeat the values of another column
I was trying to divide the month into two weeks. Basically for each month i am trying to create week numbers like 1,2,3,4 and repeat them.
How to create the required column like below:
import numpy as np
import pandas as pd
df = pd.DataFrame({'Year_Month':['2020-jan']*6+['2020-feb']+['2021-jan']*2+['2021-jan']+['2021-mar']*5})
df['B'] = [1]*6+[5]+[1]*2+[1]+[10]*5
df['groubyA_repeatB'] = [1,2,3,4,1,2,
5,
1,2,3,
10,11,12,13,10] # groupby column A and repeat [i,i+1,i+2,i+3]
# e.g. for group A: we have 8 'a' and value of a is 1, so we need to repeat [1,2,3,4] until we get 8 elements.
df
Year_Month B groubyA_repeatB # how to create this column
0 2020-jan 1 1
1 2020-jan 1 2
2 2020-jan 1 3
3 2020-jan 1 4
4 2020-jan 1 1
5 2020-jan 1 2
6 2020-feb 5 5
7 2021-jan 1 1
8 2021-jan 1 2
9 2021-jan 1 3
10 2021-mar 10 10
11 2021-mar 10 11
12 2021-mar 10 12
13 2021-mar 10 13
14 2021-mar 10 10
Solution 1:[1]
You can utilize cycle to create cycle for list and slice to get specific count
from itertools import cycle, islice
out = (df
.groupby(['Year_Month', 'B'])
.apply(lambda g: g.assign(groubyA_repeatB_=list(islice(cycle(range(g.iloc[0]['B'], g.iloc[0]['B']+4)), len(g))))))
print(out)
Year_Month B groubyA_repeatB groubyA_repeatB_
0 2020-jan 1 1 1
1 2020-jan 1 2 2
2 2020-jan 1 3 3
3 2020-jan 1 4 4
4 2020-jan 1 1 1
5 2020-jan 1 2 2
6 2020-feb 5 5 5
7 2021-jan 1 1 1
8 2021-jan 1 2 2
9 2021-jan 1 3 3
10 2021-mar 10 10 10
11 2021-mar 10 11 11
12 2021-mar 10 12 12
13 2021-mar 10 13 13
14 2021-mar 10 10 10
Solution 2:[2]
Try:
#get the total count per group and generate the repeated 1-4 series
counts = df.groupby("Year_Month",sort=False)["B"].count().apply(lambda x: [1,2,3,4]*(x//4)+[1,2,3,4][:x%4])
#explode to split lists into individual rows
counts = counts.explode().reset_index()
#add to "B" value
df["groupbyA_repeatB"] = counts["B"].add(df["B"]).sub(1)
>>> df
Year_Month B groupbyA_repeatB
0 2020-jan 1 1
1 2020-jan 1 2
2 2020-jan 1 3
3 2020-jan 1 4
4 2020-jan 1 1
5 2020-jan 1 2
6 2020-feb 5 5
7 2021-jan 1 1
8 2021-jan 1 2
9 2021-jan 1 3
10 2021-mar 10 10
11 2021-mar 10 11
12 2021-mar 10 12
13 2021-mar 10 13
14 2021-mar 10 10
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Ynjxsjmh |
| Solution 2 | not_speshal |
