How to use pandas groupby with None and NaN treated as separate values

Is it possible for pandas groupby to treat Nones and NaNs as separate entities?

Here is an example:

import numpy as np
import pandas as pd

df = pd.DataFrame([
    [np.nan, 5],
    [None, 10],
    ['a', 7],
    [np.nan, 5],
    [None, 10]
])

Out:
      0   1
0   NaN   5
1  None  10
2     a   7
3   NaN   5
4  None  10

df.groupby(0, dropna=False).mean()

Out:
       1
0       
a    7.0
NaN  7.5

However, I want to achieve the following result:

       1
0       
a    7.0
NaN  5.0
None 10.0

EDIT: an 'ideal' solution to this problem should:

  • be generalisable to grouping with multiple columns
  • not (potentially) conflate other items. E.g. converting everything to strings would conflate None and 'None' (or '7' and 7, and so on)

Alternatively, an explanation of why the task cannot be done 'tidily' would also be appreciated, so one can instead think about 'hacky' solutions.



Solution 1:[1]

Not very nice, but it is possible by converting the grouping column to strings, so that None becomes 'None' and NaN becomes 'nan':

print(df.groupby(df[0].astype(str)).mean())
       1
0       
None  10
a      7
nan    5
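
If the string conversion is too lossy (it conflates None with 'None', which the question wants to avoid), one possible workaround is to group on an extra "kind" key that records whether each cell is None, NaN, or a real value. The sketch below is only an illustration, not part of the original answer: the helper `missing_kind`, the key names `kind` and `key`, and the empty-string placeholder are all made up for this example, and the column is built with `dtype=object` on the assumption that None and NaN are then kept as distinct objects.

```python
import math

import numpy as np
import pandas as pd


def missing_kind(value):
    """Label a cell as 'None', 'NaN', or 'value' (hypothetical helper)."""
    if value is None:
        return "None"
    if isinstance(value, float) and math.isnan(value):
        return "NaN"
    return "value"


# dtype=object keeps None and NaN as distinct Python objects in the column.
df = pd.DataFrame({
    0: pd.Series([np.nan, None, "a", np.nan, None], dtype=object),
    1: [5, 10, 7, 5, 10],
})

# First grouping key: which kind of cell this is.
kind = df[0].map(missing_kind).rename("kind")

# Second grouping key: the original value, with both kinds of missing value
# replaced by one placeholder. The "kind" key still tells them apart, so the
# placeholder cannot collide with a real empty string in the data.
filled = df[0].where(kind == "value", "").rename("key")

result = df.groupby([kind, filled])[1].mean()
print(result)
```

The result is indexed by (kind, key), with the NaN rows averaging to 5.0, the None rows to 10.0, and 'a' to 7.0. For grouping on several columns, the same pair of keys could be built per column and all of them passed to groupby, at the cost of a wider index.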

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
