I want to apply a conditional filter to some rows to extract values

I have a DataFrame whose first columns are:

Brand           Price
Samsung         75.5,99.2
Lenovo          50.3
Lg              60.5,88.4

I want to extract the max value when a cell contains 2 values. I tried the following condition:

for x in df['Price']:
 if 0 <= len(x) <= 5:
  df['Price'] = df['Price'].str[0:5]
 else:
  df['Price'] = df['Price'].str[6:12]

but it doesn't work: for the cells that contain a single value, it returns a blank cell.



Solution 1:[1]

We can str.split the column on the comma, then use apply to find the max value per row (each string needs to be converted to a float so that max behaves as expected):

df['Price'] = df['Price'].str.split(',').apply(lambda c: max(map(float, c)))
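
To see why the float conversion matters, here is a minimal sketch. The values below are hypothetical (they are not from the question) and are chosen so that string ordering and numeric ordering disagree:

```python
import pandas as pd

# Hypothetical values, not from the question's sample data
s = pd.Series(['9.5,10.2', '50.3'])

parts = s.str.split(',')

# Without conversion, max compares the strings character by character
as_text = parts.apply(lambda c: max(c))

# With conversion, max compares the numbers
as_float = parts.apply(lambda c: max(map(float, c)))

print(as_text.tolist())   # ['9.5', '50.3']
print(as_float.tolist())  # [10.2, 50.3]
```

The string comparison picks '9.5' because '9' sorts after '1', which is not the numeric maximum.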

Alternatively, expand the split result into a DataFrame, convert the entire DataFrame to float with astype, and get the max per row (axis=1):

df['Price'] = df['Price'].str.split(',', expand=True).astype(float).max(axis=1)
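
As a rough illustration of what the expanded approach does, the sketch below prints the intermediate frame for the sample data. The names wide and row_max are only for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'Brand': ['Samsung', 'Lenovo', 'Lg'],
    'Price': ['75.5,99.2', '50.3', '60.5,88.4']
})

# One column per comma-separated position; rows with fewer values
# are padded with missing values, which become NaN after astype(float)
wide = df['Price'].str.split(',', expand=True).astype(float)
print(wide)

# max skips NaN by default (skipna=True), so single-value rows keep their value
row_max = wide.max(axis=1)
print(row_max.tolist())  # [99.2, 50.3, 88.4]
```

This is why the single-value row (Lenovo) no longer comes back blank: its missing second column is ignored when taking the row maximum.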

Assuming the comma-separated values are sorted (so the last value is always the largest), we can also just use the str accessor to get the last value (index -1):

df['Price'] = df['Price'].str.split(',').str[-1]
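
Note that str[-1] returns the last element as a string, so the column is not numeric until it is converted. The sketch below, under the same sortedness assumption, first checks that assumption and then converts the result. The names is_sorted and last are only for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'Brand': ['Samsung', 'Lenovo', 'Lg'],
    'Price': ['75.5,99.2', '50.3', '60.5,88.4']
})

parts = df['Price'].str.split(',')

# Check per row that the values really are in ascending numeric order
is_sorted = parts.apply(lambda c: list(map(float, c)) == sorted(map(float, c)))
print(is_sorted.all())  # True

# Take the last element of each list and convert it from str to float
last = parts.str[-1].astype(float)
print(last.tolist())  # [99.2, 50.3, 88.4]
```

If the check prints False for your data, one of the max-based options is the safer choice.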

All options produce the following df for the provided sample:

     Brand  Price
0  Samsung   99.2
1   Lenovo   50.3
2       Lg   88.4

Setup:

import pandas as pd

df = pd.DataFrame({
    'Brand': ['Samsung', 'Lenovo', 'Lg'],
    'Price': ['75.5,99.2', '50.3', '60.5,88.4']
})

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1