Fill a column with its most frequent value

I want to fill the column 'col2' with its most frequent value within each group defined by another column. However, the other columns of the DataFrame should not be affected.

import pandas as pd
d = {'col1': ['green','green','green','blue','blue','blue'],'col2': ['gx','gx','ow','nb','nb','mj'],'col3': ['omg','omg','omg','qwe','qwe','omg'],'col4':['s','u','s','s','u','u']}
dftest = pd.DataFrame(data=d)
dftest

I ran the code below, which works for col1 and col2, but I have no idea how to keep the other columns intact.

dftest = dftest.groupby('col1')['col2'].apply(lambda x: x.value_counts().index[0]).reset_index()

Expected dataframe:

col1 col2 col3 col4
green gx omg s
green gx omg u
green gx omg s
blue gx qwe s
blue gx qwe u
blue gx omg u


Solution 1:[1]

Your expected output appears slightly off, since blue has a different most frequent value in the original DataFrame (nb, not gx). The following code should get you the desired output.

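# transform returns one value per original row, so only col2 is replaced
# and col3 and col4 are left untouched.
dftest.assign(
    col2=dftest.groupby("col1", as_index=False)["col2"].transform(
        lambda x: x.value_counts().idxmax()
    )
)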

    col1 col2 col3 col4
0  green   gx  omg    s
1  green   gx  omg    u
2  green   gx  omg    s
3   blue   nb  qwe    s
4   blue   nb  qwe    u
5   blue   nb  omg    u
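
For comparison, here is a minimal, self-contained sketch of why the attempt in the question dropped col3 and col4, along with a variant of the fix. It assumes the same sample data as the question. The names `per_group` and `filled` are only illustrative, and `mode().iloc[0]` is swapped in for `value_counts().idxmax()` as an alternative, not as the method used above.

```python
import pandas as pd

d = {
    "col1": ["green", "green", "green", "blue", "blue", "blue"],
    "col2": ["gx", "gx", "ow", "nb", "nb", "mj"],
    "col3": ["omg", "omg", "omg", "qwe", "qwe", "omg"],
    "col4": ["s", "u", "s", "s", "u", "u"],
}
dftest = pd.DataFrame(data=d)

# Aggregating yields one row per group, so the row-level columns
# (col3, col4) have nowhere to go. This is what the question's code did.
per_group = dftest.groupby("col1")["col2"].agg(
    lambda x: x.value_counts().idxmax()
)

# transform broadcasts the per-group value back onto the original index,
# so it can be assigned to col2 without touching the other columns.
# Assumption: on a tie, mode() lists the tied values in sorted order,
# so iloc[0] picks the smallest one.
filled = dftest.assign(
    col2=dftest.groupby("col1")["col2"].transform(lambda x: x.mode().iloc[0])
)

print(per_group)
print(filled)
```

If ties within a group matter for your data, it is probably worth deciding on an explicit tie-breaking rule rather than relying on whichever value either approach happens to return first.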

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 gold_cy