Pandas rolling window on a dataframe with a string-type column

I created the following dataframe:

import pandas as pd

d = {'sensor1': [1.4, 1, 0.5, 1, 3], 'sensor2': [1.2, 1.5, 0.5, 1, 4], 'label': ["a", "a", "b", "b", "c"]}

df = pd.DataFrame(data=d)

As you can see, I have two columns of sensor values and a third column with the corresponding labels. From this, I want to create another dataframe containing the means of the sensor observations over a window of length 3. To do so, I applied the pandas rolling window function by calling df_mean = df.rolling(3).mean(), which gives:

    sensor1   sensor2
0       NaN       NaN
1       NaN       NaN
2  0.966667  1.066667
3  0.833333  1.000000
4  1.500000  1.833333

However, I want to preserve the labels: for each window of 3, the label should be the most frequently occurring one. Any ideas on how to achieve this, so that the resulting dataframe has the same columns as the original but with the new mean values and the corresponding labels (the most frequent label in each window)? I saw that encoding the labels as integers 1 to 3 would let me use, for instance:

df_labels = df["label"].rolling(window = 3).apply(lambda x: x.mode()[0])

which gives me the correct labels (though, of course, encoded), after which I would have to merge both dataframes to achieve my goal. Is there another way to do this without encoding the strings?
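For reference, the encoding route described above could look roughly like the sketch below. It assumes pd.factorize for the integer codes (the question does not say how the labels were encoded, and factorize numbers them from 0 rather than 1); the names codes, uniques, mode_codes, and df_out are placeholders chosen for illustration.

```python
import pandas as pd

d = {'sensor1': [1.4, 1, 0.5, 1, 3],
     'sensor2': [1.2, 1.5, 0.5, 1, 4],
     'label': ["a", "a", "b", "b", "c"]}
df = pd.DataFrame(data=d)

# Assumption: encode the labels with pd.factorize (codes start at 0).
codes, uniques = pd.factorize(df['label'])
encoded = pd.Series(codes, index=df.index)

# Rolling mode on the integer codes, as in the question's snippet.
mode_codes = encoded.rolling(window=3).apply(lambda x: x.mode()[0])

# Decode back to the original strings, leaving incomplete windows as NaN.
df_labels = mode_codes.map(lambda c: uniques[int(c)], na_action='ignore')

# Merge the rolling means with the decoded labels.
df_mean = df[['sensor1', 'sensor2']].rolling(3).mean()
df_out = df_mean.join(df_labels.rename('label'))
print(df_out)
```

With this approach, ties within a window go to the label with the lowest code, which is the label that appears first in the column.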

The final output I want is (given the example above and a rolling window of length 3):

    sensor1   sensor2 label
0       NaN       NaN   NaN
1       NaN       NaN   NaN
2  0.966667  1.066667     a
3  0.833333  1.000000     b
4  1.500000  1.833333     b



Solution 1:[1]

If you want to automate things, you could use select_dtypes to average only the numeric columns and then join the non-numeric ones back:

df2 = (df
 .select_dtypes('number')
 .rolling(3).mean()
 .join(df.select_dtypes(exclude='number'))
)

output:

    sensor1   sensor2 label
0       NaN       NaN     a
1       NaN       NaN     a
2  0.966667  1.066667     b
3  0.833333  1.000000     b
4  1.500000  1.833333     c
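Note that the join above carries over each row's own label, so the label column differs from the output requested in the question. If the most frequent label per window is needed without encoding the strings, one possible sketch is to compute the mode over explicit slices of the label column. This is not part of the original answer; rolling_mode is a hypothetical helper written for this example, and it assumes a fixed window size with NaN for incomplete windows.

```python
import numpy as np
import pandas as pd

d = {'sensor1': [1.4, 1, 0.5, 1, 3],
     'sensor2': [1.2, 1.5, 0.5, 1, 4],
     'label': ["a", "a", "b", "b", "c"]}
df = pd.DataFrame(data=d)

window = 3


def rolling_mode(labels, window):
    """Hypothetical helper: most frequent value in each trailing window.

    Positions without a full window get NaN. Series.mode() returns its
    values sorted, so ties go to the alphabetically first label.
    """
    values = [
        labels.iloc[i - window + 1:i + 1].mode().iloc[0]
        if i >= window - 1 else np.nan
        for i in range(len(labels))
    ]
    return pd.Series(values, index=labels.index, name=labels.name, dtype=object)


# Rolling mean on the numeric columns, rolling mode on the string column.
result = df.select_dtypes('number').rolling(window).mean()
result['label'] = rolling_mode(df['label'], window)
print(result)
```

The loop runs in plain Python, so it is likely to be slow on long columns; for large data, the integer-encoding route mentioned in the question may be preferable.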

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 mozway