Python: get a frequency count based on two columns (variables) in pandas dataframe some row appears
Hello, I have the following dataframe.
Group Size
Short Small
Short Small
Moderate Medium
Moderate Small
Tall Large
I want to count how many times the same row appears in the dataframe.
Group Size Time
Short Small 2
Moderate Medium 1
Moderate Small 1
Tall Large 1
Solution 1:[1]
Update: since pandas 1.1, value_counts accepts multiple columns:
df.value_counts(["Group", "Size"])
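To get from that result to the table in the question, the counts can be turned back into a regular column. This is a minimal sketch, assuming pandas 1.1 or later; the frame is rebuilt from the question's data, and the column name Time is taken from the desired output.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "Group": ["Short", "Short", "Moderate", "Moderate", "Tall"],
    "Size": ["Small", "Small", "Medium", "Small", "Large"],
})

# value_counts returns a Series indexed by (Group, Size);
# reset_index(name=...) turns the counts into a column called "Time".
counts = df.value_counts(["Group", "Size"]).reset_index(name="Time")
print(counts)
```

By default value_counts sorts by count in descending order, so the most frequent combination comes first.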
You can also try pd.crosstab()
Group Size
Short Small
Short Small
Moderate Medium
Moderate Small
Tall Large
pd.crosstab(df.Group,df.Size)
Size Large Medium Small
Group
Moderate 0 1 1
Short 0 0 2
Tall 1 0 0
EDIT: To get your desired output:
import numpy as np
pd.crosstab(df.Group, df.Size).replace(0, np.nan).\
    stack().reset_index().rename(columns={0: 'Time'})
Out[591]:
Group Size Time
0 Moderate Medium 1.0
1 Moderate Small 1.0
2 Short Small 2.0
3 Tall Large 1.0
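The counts above come out as floats because the zeros were replaced with NaN before stacking. If integer counts are preferred, one possible variant (a sketch, not part of the original answer) is to stack first and then filter out the zero rows:

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "Group": ["Short", "Short", "Moderate", "Moderate", "Tall"],
    "Size": ["Small", "Small", "Medium", "Small", "Large"],
})

# Wide table of counts: one row per Group, one column per Size.
table = pd.crosstab(df["Group"], df["Size"])

# Reshape to long form, then keep only combinations that actually occur.
long_form = table.stack().reset_index(name="Time")
result = long_form[long_form["Time"] > 0].reset_index(drop=True)
print(result)
```

Since no NaN is introduced, the Time column keeps its integer dtype.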
Solution 2:[2]
Another possibility is using .pivot_table() with aggfunc='size':
df_solution = df.pivot_table(index=['Group','Size'], aggfunc='size')
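For a frame like the one in the question, this call should return a Series indexed by (Group, Size). A short sketch of converting it into the requested layout, assuming that Series result and using the column name Time from the question:

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "Group": ["Short", "Short", "Moderate", "Moderate", "Tall"],
    "Size": ["Small", "Small", "Medium", "Small", "Large"],
})

# Count rows per (Group, Size) combination.
df_solution = df.pivot_table(index=["Group", "Size"], aggfunc="size")

# Move the index levels back into columns and name the counts "Time".
result = df_solution.reset_index(name="Time")
print(result)
```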
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | asantz96 |
