'calculate average intensity of equivalent rows defined to have hkl, -hk-l, -h-k-l, h-kl

I have data like this:
'''

data = {'h': [1, 1, -1, 0, 0, 1, -1, 5, -5, 5],
                'k': [-1, 1, 1, -1, 1, 2, 2, 3, -3, 3],
                'l': [-1, -1, 1, 1, 1, -1, 1, -1, 1, 1],
                'Intensity': [138, 186, 124, 159, 176, 194, 199, 172, 129, 134]}
    df1 = pd.DataFrame(data)
    print(df1)

'''

I would like to find the average intensity of symmetry equivalent rows which are defined to be: hkl, -hk-l, -h-k-l, h-kl. I know I should use groupby() and then apply mean(), but it seems confusing to apply this condition. I tried using iterrows but still did not get what I wanted. Note that I can not use absolute values of h,k,l because this would include 4 more possibilities eg. h-k-l which is not equivalent to the group above. The expected result is:

expected output



Solution 1:[1]

IIUC, you could group using the absolute values of the h/k/l columns:

cols = ['h', 'k', 'l']
df1['average'] = (df1.groupby(df1[cols].abs().apply(tuple, axis=1))
                     ['Intensity'].transform('mean')
                  )

alternative syntax:

df1['average'] = (df1.assign(**{k: df1[k].abs() for k in cols})
                     .groupby(cols)['Intensity'].transform('mean')
                  )

Or, if "Intensity" is always positive:

df1['average'] = df1.abs().groupby(['h', 'k', 'l'])['Intensity'].transform('mean')

output:

   h  k  l  Intensity     average
0  1 -1 -1        138  149.333333
1  1  1 -1        186  149.333333
2 -1  1  1        124  149.333333
3  0 -1  1        159  167.500000
4  0  1  1        176  167.500000
5  1  2 -1        194  196.500000
6 -1  2  1        199  196.500000
7  5  3 -1        172  145.000000
8 -5 -3  1        129  145.000000
9  5  3  1        134  145.000000

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 mozway