How to apply a function to selected rows of a dataframe

I want to apply a regex function to selected rows in a dataframe. My solution works, but the code is terribly long, and I wonder whether there is a better, faster, and more elegant way to solve this problem.

In words: I want my regex function applied to elements of the source_value column, but only in rows where source_type == 'rhombus' AND rhombus_refer_to_odk_type is either 'integer' OR 'decimal'.

The code:

df_arrows.loc[(df_arrows['source_type']=='rhombus') & ((df_arrows['rhombus_refer_to_odk_type']=='integer') | (df_arrows['rhombus_refer_to_odk_type']=='decimal')),'source_value'] = df_arrows.loc[(df_arrows['source_type']=='rhombus') & ((df_arrows['rhombus_refer_to_odk_type']=='integer') | (df_arrows['rhombus_refer_to_odk_type']=='decimal')),'source_value'].apply(lambda x: re.sub(r'^[^<=>]+','', str(x)))

python pandas

Solution 1:^[1]

Build the condition once with Series.isin and store it in a variable m, then do the replacement with Series.str.replace:

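# Wrap the whole condition in parentheses so it can span two lines
m = ((df_arrows['source_type']=='rhombus') &
     df_arrows['rhombus_refer_to_odk_type'].isin(['integer','decimal']))
# regex=True is needed in pandas 2.0 and later, where str.replace matches literally by default
df_arrows.loc[m,'source_value'] = df_arrows.loc[m,'source_value'].astype(str).str.replace(r'^[^<=>]+','', regex=True)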
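
To see the mask approach end to end, here is a minimal sketch on invented sample data. The column names come from the question, but the values are assumptions made up for illustration. It passes regex=True explicitly, because pandas 2.0 and later treat the pattern as a literal string by default.

```python
import pandas as pd

# Hypothetical sample data: column names match the question, values are invented.
df_arrows = pd.DataFrame({
    'source_type': ['rhombus', 'rhombus', 'rectangle', 'rhombus'],
    'rhombus_refer_to_odk_type': ['integer', 'select_one', 'integer', 'decimal'],
    'source_value': ['age>=5', 'colour=red', 'weight<3', 'temp<37.5'],
})

# Boolean mask: rhombus rows whose referenced type is integer or decimal.
m = ((df_arrows['source_type'] == 'rhombus')
     & df_arrows['rhombus_refer_to_odk_type'].isin(['integer', 'decimal']))

# Strip everything before the first comparison operator, only in masked rows.
df_arrows.loc[m, 'source_value'] = (
    df_arrows.loc[m, 'source_value']
    .astype(str)
    .str.replace(r'^[^<=>]+', '', regex=True)
)

print(df_arrows['source_value'].tolist())
# ['>=5', 'colour=red', 'weight<3', '<37.5']
```

Only the first and last rows satisfy both conditions, so only those two values lose their leading text; the other rows are left untouched.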

EDIT: If the mask turns out to be 2-dimensional, the likely cause is duplicated column names. You can check by printing each condition separately:

print(df_arrows['source_type']=='rhombus')
print(df_arrows['rhombus_refer_to_odk_type'].isin(['integer','decimal']))
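
As a rough illustration of why duplicated column names matter, the sketch below builds a hypothetical frame (df_dup, with invented values) in which source_type appears twice. Selecting that name then returns a DataFrame rather than a Series, so the comparison yields a 2-dimensional mask.

```python
import pandas as pd

# Hypothetical frame in which the column name 'source_type' is duplicated.
df_dup = pd.DataFrame(
    [['rhombus', 'x', 'a>1'], ['rectangle', 'y', 'b<2']],
    columns=['source_type', 'source_type', 'source_value'],
)

# With a duplicated name, df_dup['source_type'] is a DataFrame,
# so the comparison produces a 2-dimensional boolean result.
cond = df_dup['source_type'] == 'rhombus'
print(type(cond).__name__)  # DataFrame
print(cond.ndim)            # 2

# List the duplicated column names directly.
dupes = df_dup.columns[df_dup.columns.duplicated()].tolist()
print(dupes)                # ['source_type']
```

If dupes is non-empty, renaming or dropping the repeated columns should bring each condition back to a 1-dimensional Series.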

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1
