Find standard deviation of a column based on values from another column and group by
I have a data frame looking like this:
classid grade haveTeacher
0 99 1
1 40 1
1 50 0
1 70 1
2 50 0
3 34 0
I'd like to know how to use pandas to compute the standard deviation of "grade" for each classid, counting only the rows that have a teacher (haveTeacher is 1 when there is a teacher). I know we would have to groupby "classid", but what would go inside the .apply and lambda function to satisfy these conditions?
Solution 1:[1]
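You might first want to get the rows of the data frame that have a teacher: df[df['haveTeacher'] == 1]. Note that the column name is case-sensitive, so it must be spelled haveTeacher as in the data frame. Once you have this, you can group by classid and aggregate grade with the standard deviation. Passing the string 'std' uses pandas' own std, which computes the sample standard deviation (ddof=1).
So you have:
>>> df[df['haveTeacher'] == 1].groupby(['classid']).agg({'grade': 'std'})
The output is shown below. classid 0 has only one row with a teacher, so its sample standard deviation is NaN: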
grade
classid
0 NaN
1 21.213203
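For reference, here is a minimal self-contained sketch of the approach above, built from the sample data in the question. The variable names (with_teacher, sample_std, population_std, by_apply) are illustrative only and not part of the original answer. It also shows one possible way to write the .apply/lambda version the question asks about, although filtering first is usually simpler.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "classid": [0, 1, 1, 1, 2, 3],
    "grade": [99, 40, 50, 70, 50, 34],
    "haveTeacher": [1, 1, 0, 1, 0, 0],
})

# Keep only the rows that have a teacher, then group and aggregate.
with_teacher = df[df["haveTeacher"] == 1]

# Sample standard deviation (ddof=1, the pandas default).
sample_std = with_teacher.groupby("classid")["grade"].std()

# Population standard deviation (ddof=0), if that is what you need.
population_std = with_teacher.groupby("classid")["grade"].std(ddof=0)

# Equivalent idea with .apply and a lambda: filter inside each group.
# Classes without any teacher rows stay in the result as NaN.
by_apply = df.groupby("classid")[["grade", "haveTeacher"]].apply(
    lambda g: g.loc[g["haveTeacher"] == 1, "grade"].std()
)

print(sample_std)
print(population_std)
print(by_apply)
```

With the filter-first version, classes that have no teacher rows (2 and 3 here) drop out of the result entirely; with the .apply version they remain and show NaN.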
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Rajarshi Ghosh |
