How to create a new column based on conditions in the existing columns in a dataframe in Python?
I have a dataframe like the below:
df:
PAN_NO COST_VALUE
AAA -0.001
BBB 2938080
CCC 49224091
DDD 100
EEE 50236272.32
I am trying to create a new column based on the below condition:
If df['cost_value'] >= -0.001 and df['cost_value'] <= 299985.0 then cost_value_group should be 1
If df['cost_value'] > 299985.0 and df['cost_value'] <= 2938082.40 then cost_value_group should be 2
If df['cost_value'] > 2938082.40 and df['cost_value'] <= 17399130.0 then cost_value_group should be 3
If df['cost_value'] > 17399130.0 and df['cost_value'] <= 49224091.375 then cost_value_group should be 4
If df['cost_value'] > 49224091.375 then cost_value_group should be 5
Else it should be 6
EXPECTED OUTPUT:
PAN_NO COST_VALUE COST_VALUE_Group
AAA -0.001 1
BBB 2938080 2
CCC 49224091 5
DDD 100 1
EEE 50236272.32 6
I tried this:
def cost_value(x):
    if df['cost_value'] >= -0.001 and df['cost_value'] <= 299985.0:
        return 1
    elif df['cost_value'] > 299985.0 and df['cost_value'] <= 2938082.40:
        return 2
    elif df['cost_value'] > 2938082.40 and df['cost_value'] <= 17399130.0:
        return 3
    elif df['cost_value'] > 17399130.0 and df['cost_value'] <= 49224091.375:
        return 4
    elif df['cost_value'] > 49224091.375:
        return 5
    else:
        return 6
df['cost_value_group'] = df['cost_value'].apply(cost_value)
I am getting a ValueError saying that the truth value of a Series is ambiguous.
Can someone please help me with this?
Solution 1:[1]
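You are on the right path, but the function body compares the whole column df['cost_value'] instead of the single value x that apply passes in, and that comparison is what raises the error.
Replace df['cost_value'] with x inside cost_value, then try this: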
df['cost_value_group'] = df['cost_value'].apply(lambda x: cost_value(x))
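For reference, here is a minimal sketch of the whole thing with the comparisons done on x. It assumes the column is named cost_value (as in the question's code, although the printed frame shows COST_VALUE) and rebuilds the sample frame by hand. It also shows a vectorized alternative with pd.cut, which is a swapped-in technique and not part of the original answer. Note that, following the conditions exactly as written, 49224091 falls in group 4 and 50236272.32 in group 5, which differs from the expected output shown in the question.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; the lowercase column name is an assumption.
df = pd.DataFrame({
    "PAN_NO": ["AAA", "BBB", "CCC", "DDD", "EEE"],
    "cost_value": [-0.001, 2938080, 49224091, 100, 50236272.32],
})


def cost_value(x):
    """Map a single scalar value to its group, comparing x rather than the column."""
    if -0.001 <= x <= 299985.0:
        return 1
    elif 299985.0 < x <= 2938082.40:
        return 2
    elif 2938082.40 < x <= 17399130.0:
        return 3
    elif 17399130.0 < x <= 49224091.375:
        return 4
    elif x > 49224091.375:
        return 5
    else:
        # Anything below -0.001 (or NaN) ends up here.
        return 6


# apply calls cost_value once per element of the column.
df["cost_value_group"] = df["cost_value"].apply(cost_value)

# Vectorized alternative: bin edges are the same thresholds, right-closed intervals.
bins = [-0.001, 299985.0, 2938082.40, 17399130.0, 49224091.375, np.inf]
codes = pd.cut(df["cost_value"], bins=bins, labels=False, include_lowest=True)
# codes are 0-based bin positions; values outside every bin come back as NaN.
df["cost_value_group_cut"] = codes.add(1).fillna(6).astype(int)

print(df)
```

Both columns should agree on this sample. The pd.cut version avoids a Python-level call per row, which tends to matter only on large frames.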
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | tibipin |
