'calculate average intensity of equivalent rows defined to have hkl, -hk-l, -h-k-l, h-kl
I have data like this:
'''
data = {'h': [1, 1, -1, 0, 0, 1, -1, 5, -5, 5],
'k': [-1, 1, 1, -1, 1, 2, 2, 3, -3, 3],
'l': [-1, -1, 1, 1, 1, -1, 1, -1, 1, 1],
'Intensity': [138, 186, 124, 159, 176, 194, 199, 172, 129, 134]}
df1 = pd.DataFrame(data)
print(df1)
'''
I would like to find the average intensity of symmetry equivalent rows which are defined to be: hkl, -hk-l, -h-k-l, h-kl. I know I should use groupby() and then apply mean(), but it seems confusing to apply this condition. I tried using iterrows but still did not get what I wanted. Note that I can not use absolute values of h,k,l because this would include 4 more possibilities eg. h-k-l which is not equivalent to the group above. The expected result is:
Solution 1:[1]
IIUC, you could group using the absolute values of the h/k/l columns:
cols = ['h', 'k', 'l']
df1['average'] = (df1.groupby(df1[cols].abs().apply(tuple, axis=1))
['Intensity'].transform('mean')
)
alternative syntax:
df1['average'] = (df1.assign(**{k: df1[k].abs() for k in cols})
.groupby(cols)['Intensity'].transform('mean')
)
Or, if "Intensity" is always positive:
df1['average'] = df1.abs().groupby(['h', 'k', 'l'])['Intensity'].transform('mean')
output:
h k l Intensity average
0 1 -1 -1 138 149.333333
1 1 1 -1 186 149.333333
2 -1 1 1 124 149.333333
3 0 -1 1 159 167.500000
4 0 1 1 176 167.500000
5 1 2 -1 194 196.500000
6 -1 2 1 199 196.500000
7 5 3 -1 172 145.000000
8 -5 -3 1 129 145.000000
9 5 3 1 134 145.000000
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | mozway |

