Pandas: how to group rows by a dictionary of {row: group}
I have a dataframe with n rows:
1 2 3
3 4 1
5 3 2
9 8 2
7 2 6
0 0 0
4 4 4
8 4 1
...
and a dictionary in which each key is a row index and each value is the group that row belongs to:
d = {0 : 0 , 1: 0, 2 : 0, 3 : 1, 4 : 1, 5: 2, 6: 2}
I want to group the rows by these dictionary values and then take the mean of each group, so that I get:
3 3 2 #This is the mean of rows 0,1,2 from the original df, as d[0]=d[1]=d[2]=0
8 5 4
2 2 2
8 4 1
What is the best way to do so?
Solution 1:[1]
Simply pass the dictionary to groupby. Each index value is replaced by the dictionary value for the matching key, and the rows are grouped on the result:
df.groupby(d).mean()
output:
       a    b    c
0.0  3.0  3.0  2.0
1.0  8.0  5.0  4.0
2.0  2.0  2.0  2.0
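For reference, here is a minimal self-contained sketch of the call above. It assumes the column names a, b, c and the default integer index from the input shown at the end of this answer; the variable name `means` is only illustrative.

```python
import pandas as pd

# Input frame, assuming columns a, b, c and a default 0..7 index.
df = pd.DataFrame(
    {
        "a": [1, 3, 5, 9, 7, 0, 4, 8],
        "b": [2, 4, 3, 8, 2, 0, 4, 4],
        "c": [3, 1, 2, 2, 6, 0, 4, 1],
    }
)

# Row index -> group, as in the question. Index 7 has no entry.
d = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 2}

# The dict is applied to the index labels; rows whose label is not
# a key get a missing group value and are dropped by default.
means = df.groupby(d).mean()
print(means)
```

Row 7 does not appear in the result because its index is not a key of `d`.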
If you also want to keep the rows whose index is missing from the dictionary, pass dropna=False to groupby. Those rows are collected in the NaN group:
df.groupby(d, dropna=False).mean()
output:
       a    b    c
0.0  3.0  3.0  2.0
1.0  8.0  5.0  4.0
2.0  2.0  2.0  2.0
NaN  8.0  4.0  1.0
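A small sketch of the dropna=False variant, again assuming the a, b, c input frame. The dropna argument of groupby was added in pandas 1.1, so this assumes at least that version; the name `with_missing` is illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "a": [1, 3, 5, 9, 7, 0, 4, 8],
        "b": [2, 4, 3, 8, 2, 0, 4, 4],
        "c": [3, 1, 2, 2, 6, 0, 4, 1],
    }
)
d = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 2}

# dropna=False keeps the group whose key is missing, so row 7
# (absent from d) forms its own group instead of being discarded.
with_missing = df.groupby(d, dropna=False).mean()
print(with_missing)

# The last group key is the missing value.
print(pd.isna(with_missing.index[-1]))
```

If several rows were missing from the dictionary, they would all be averaged together in that single missing-key group.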
To get a plain range index instead of the group values as the index, add as_index=False:
df.groupby(d, dropna=False, as_index=False).mean()
output:
     a    b    c
0  3.0  3.0  2.0
1  8.0  5.0  4.0
2  2.0  2.0  2.0
3  8.0  4.0  1.0
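As a hedged alternative, the same range-indexed result can be obtained by discarding the group index after aggregating. This sketch swaps as_index=False for reset_index(drop=True), which does not depend on how a given pandas version treats group keys that are not columns of the frame; the name `flat` is illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "a": [1, 3, 5, 9, 7, 0, 4, 8],
        "b": [2, 4, 3, 8, 2, 0, 4, 4],
        "c": [3, 1, 2, 2, 6, 0, 4, 1],
    }
)
d = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 2}

# Aggregate as before, then drop the group labels and renumber
# the rows 0..n-1.
flat = df.groupby(d, dropna=False).mean().reset_index(drop=True)
print(flat)
```

Note that the group labels are lost in this form, so it suits cases where only the order of the groups matters.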
Input used:
   a  b  c
0  1  2  3
1  3  4  1
2  5  3  2
3  9  8  2
4  7  2  6
5  0  0  0
6  4  4  4
7  8  4  1
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
