Replace the value of a column which is a close match to another column in pandas

There is a dataframe as follows:

ID   Mat_Des                             Matched_Des                         Price    Score
1    4-STROKE 25HP OB MOTOR FOR GEMINI   4- STROKE 25 HP OB MOTER FOR GEMNI  10000      100
2    OBS for 25HP OBM                    STANDARD TOOL KIT                    5000       94
3    Accessories for 25HP OBM            SERVICE ENGINEERING                  5000       54
4    Standard Tool Kit for 25HP OBM      PBS DOCUMENTATION                    1000
5    OWNER’S MANUAL (IN ENGLISH)

The Score is derived from fuzzy matching logic that compares Mat_Des and Matched_Des using fuzz.partial_ratio, with the threshold set to 85. I want a resulting dataframe in which the Matched_Des column is dropped but the Price is filled in accordingly. So the resulting dataframe would be:

ID   Mat_Des                               Price
1    4-STROKE 25HP OB MOTOR FOR GEMINI     10000
2    OBS for 25HP OBM                       5000
3    Accessories for 25HP OBM                 0               
4    Standard Tool Kit for 25HP OBM         5000    
5    OWNER’S MANUAL (IN ENGLISH)              0

Please note that for Mat_Des "Standard Tool Kit for 25HP OBM" the Price is filled in as 5000, because that is the Price listed for "STANDARD TOOL KIT" under Matched_Des.

To start, I want to use an approach similar to:

 df['Mat_Des'] = np.where(df['Score']>85, df['Matched_Des'],df['Mat_Des'])

But the above approach would replace "OBS for 25HP OBM" with "STANDARD TOOL KIT".
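To make this concrete, here is a minimal sketch on the sample data (typed in by hand from the table above, so the frame itself is an assumption). Pointing np.where at Price instead of Mat_Des leaves the descriptions untouched. It only looks at each row's own Score, so it does not yet handle ID 4, whose Price has to come from a different row.

```python
import numpy as np
import pandas as pd

# Sample frame reconstructed from the table in the question (assumed layout).
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "Mat_Des": [
        "4-STROKE 25HP OB MOTOR FOR GEMINI",
        "OBS for 25HP OBM",
        "Accessories for 25HP OBM",
        "Standard Tool Kit for 25HP OBM",
        "OWNER’S MANUAL (IN ENGLISH)",
    ],
    "Matched_Des": [
        "4- STROKE 25 HP OB MOTER FOR GEMNI",
        "STANDARD TOOL KIT",
        "SERVICE ENGINEERING",
        "PBS DOCUMENTATION",
        None,
    ],
    "Price": [10000, 5000, 5000, 1000, None],
    "Score": [100, 94, 54, None, None],
})

# Keep Mat_Des as it is and decide only the Price: rows whose Score is
# above 85 keep their Price, every other row (including NaN scores) gets 0.
out = df[["ID", "Mat_Des"]].copy()
out["Price"] = np.where(df["Score"] > 85, df["Price"], 0).astype(int)

print(out)
```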

Any clue on how to address this?
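
One possible reading of the desired output: keep a row's own Price when its Score clears the threshold, and otherwise look for the best-matching entry anywhere in Matched_Des and borrow that entry's Price. The sketch below follows that reading and is not a definitive solution. partial_score is a hypothetical standard-library stand-in for fuzz.partial_ratio (lower-cased, built on difflib), so its scores will not be identical to the library's. looked_up_price and THRESHOLD are likewise illustrative names, and the sample frame is typed in from the table above.

```python
from difflib import SequenceMatcher

import numpy as np
import pandas as pd

THRESHOLD = 85  # same cut-off as in the question


def partial_score(a: str, b: str) -> float:
    """Hypothetical stand-in for fuzz.partial_ratio: best 0-100 similarity of
    the shorter string against every equally long window of the longer one."""
    short, long_ = sorted((a.lower(), b.lower()), key=len)
    if not short:
        return 0.0
    n = len(short)
    windows = (long_[i:i + n] for i in range(len(long_) - n + 1))
    return 100 * max(SequenceMatcher(None, short, w).ratio() for w in windows)


df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "Mat_Des": [
        "4-STROKE 25HP OB MOTOR FOR GEMINI",
        "OBS for 25HP OBM",
        "Accessories for 25HP OBM",
        "Standard Tool Kit for 25HP OBM",
        "OWNER’S MANUAL (IN ENGLISH)",
    ],
    "Matched_Des": [
        "4- STROKE 25 HP OB MOTER FOR GEMNI",
        "STANDARD TOOL KIT",
        "SERVICE ENGINEERING",
        "PBS DOCUMENTATION",
        None,
    ],
    "Price": [10000, 5000, 5000, 1000, None],
    "Score": [100, 94, 54, None, None],
})

# Every (Matched_Des, Price) pair that can serve as a lookup entry.
lookup = df.dropna(subset=["Matched_Des", "Price"])


def looked_up_price(description: str) -> float:
    """Price of the closest Matched_Des entry, or 0 if nothing is close enough."""
    scores = lookup["Matched_Des"].map(lambda m: partial_score(description, m))
    if scores.max() < THRESHOLD:
        return 0.0
    return float(lookup.at[scores.idxmax(), "Price"])


result = df[["ID", "Mat_Des"]].copy()
# Rows with a high Score keep their own Price; the rest fall back to the lookup.
result["Price"] = np.where(
    df["Score"] > THRESHOLD,
    df["Price"],
    df["Mat_Des"].map(looked_up_price),
).astype(int)

print(result)
```

On the sample data this gives prices 10000, 5000, 0, 5000, 0, with Mat_Des left unchanged and Matched_Des dropped. Comparing every description against every lookup entry is quadratic, which is fine for a small frame but may need a faster matcher on large ones.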



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
