Pandas multiple conditions within a single column
In the df below, I need to set COST A and COST B for TYPE E to 0 and change COMMENT to 'Un reported cost' when both of the following conditions are met:
- E and F have the same cost for 'COST A'
- E and F have the same cost for 'COST B'
As you can see in the expected result, 0.05 and 20 for E are replaced with 0, because E and F have the same costs.
import pandas as pd

df = pd.DataFrame([['1/1/2021','SKU_1','A','0','0','Un reported cost'],
                   ['1/1/2021','SKU_1','B','0','0','Un reported cost'],
                   ['1/1/2021','SKU_1','C','0','0','Un reported cost'],
                   ['1/1/2021','SKU_1','D','0','0','Un reported cost'],
                   ['1/1/2021','SKU_1','E','0.05','20','Calculated'],
                   ['1/1/2021','SKU_1','F','0.05','20','Actual']],
                  columns = ['MTH-YR','SKU','TYPE','COST A','COST B','COMMENT'])
Expected result:
     MTH-YR    SKU TYPE COST A COST B           COMMENT
0  1/1/2021  SKU_1    A      0      0  Un reported cost
1  1/1/2021  SKU_1    B      0      0  Un reported cost
2  1/1/2021  SKU_1    C      0      0  Un reported cost
3  1/1/2021  SKU_1    D      0      0  Un reported cost
4  1/1/2021  SKU_1    E      0      0  Un reported cost
5  1/1/2021  SKU_1    F   0.05     20            Actual
Solution 1:[1]
If this is only for this small dataset, use the .iloc indexer to set the values by position.
## Change COST A for E to '0' (the cost columns hold strings in this df)
df.iloc[4,3] = '0'
## Change COST B for E to '0'
df.iloc[4,4] = '0'
## Change COMMENT for E to 'Un reported cost'
df.iloc[4,5] = 'Un reported cost'
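Positional indexing breaks as soon as the row order changes, so a label-based variant may be easier to maintain. The following is a minimal sketch, assuming the same column names as the question and a shortened version of its data; it selects the E rows with a boolean mask and `.loc` instead of hard-coded positions.

```python
import pandas as pd

# Shortened copy of the question's data (rows B, C and D omitted for brevity)
df = pd.DataFrame(
    [['1/1/2021', 'SKU_1', 'A', '0', '0', 'Un reported cost'],
     ['1/1/2021', 'SKU_1', 'E', '0.05', '20', 'Calculated'],
     ['1/1/2021', 'SKU_1', 'F', '0.05', '20', 'Actual']],
    columns=['MTH-YR', 'SKU', 'TYPE', 'COST A', 'COST B', 'COMMENT'])

# Boolean mask that is True on every row whose TYPE is 'E'
is_e = df['TYPE'] == 'E'

# Overwrite both cost columns, then the comment, on the masked rows only
df.loc[is_e, ['COST A', 'COST B']] = '0'
df.loc[is_e, 'COMMENT'] = 'Un reported cost'

print(df)
```

Like the .iloc version, this does not check whether E and F actually share the same costs; it only removes the dependence on row position.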
UPDATE: I was playing around with this before going to sleep, and came up with this solution for a larger dataset.
Let's define your data. I have added duplicate E rows here.
## Create dataframe
df = pd.DataFrame([['1/1/2021','SKU_1','A','0','0','Un reported cost'],
                   ['1/1/2021','SKU_1','B','0','0','Un reported cost'],
                   ['1/1/2021','SKU_1','C','0','0','Un reported cost'],
                   ['1/1/2021','SKU_1','D','0','0','Un reported cost'],
                   ['1/1/2021','SKU_1','E','0.05','20','Calculated'],
                   ['1/1/2021','SKU_1','E','0.05','20','Calculated'],
                   ['1/1/2021','SKU_1','E','0.05','20','Calculated'],
                   ['1/1/2021','SKU_1','F','0.05','20','Actual']],
                  columns = ['MTH-YR','SKU','TYPE','COST A','COST B','COMMENT'])
Using a list comprehension, we can assign values to COST B based on the condition that you need.
df['COST B'] = ['0' if t == 'E' else cost for t, cost in zip(df['TYPE'], df['COST B'])]
In essence, we're looping through the values in TYPE to see which rows are type 'E'. When one is found, we change its value to '0'; otherwise, we keep the value already stored in COST B. The same logic applies to COST A and COMMENT. Note that this replaces every E row without checking whether E and F actually have the same costs.
df['COST A'] = ['0' if t == 'E' else cost for t, cost in zip(df['TYPE'], df['COST A'])]
df['COMMENT'] = ['Un reported cost' if t == 'E' else comment for t, comment in zip(df['TYPE'], df['COMMENT'])]
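The question asks for the replacement only when E and F have the same COST A and the same COST B, which the list comprehensions above do not test. One possible way to add that check is sketched below. It is not part of the original answer, and it rests on a few assumptions: E and F are compared within each (MTH-YR, SKU) pair, each pair has at most one F row, the costs are compared as strings because that is how the question stores them, and the SKU_2 rows are hypothetical data added to show a group where the costs differ.

```python
import pandas as pd

df = pd.DataFrame(
    [['1/1/2021', 'SKU_1', 'D', '0', '0', 'Un reported cost'],
     ['1/1/2021', 'SKU_1', 'E', '0.05', '20', 'Calculated'],
     ['1/1/2021', 'SKU_1', 'F', '0.05', '20', 'Actual'],
     # Hypothetical extra group in which E and F do NOT match
     ['1/1/2021', 'SKU_2', 'E', '0.07', '30', 'Calculated'],
     ['1/1/2021', 'SKU_2', 'F', '0.05', '20', 'Actual']],
    columns=['MTH-YR', 'SKU', 'TYPE', 'COST A', 'COST B', 'COMMENT'])

keys = ['MTH-YR', 'SKU']        # assumed grouping for the E/F comparison
costs = ['COST A', 'COST B']

# One row of F costs per group (assumes at most one F row per group)
f_costs = df.loc[df['TYPE'] == 'F', keys + costs].drop_duplicates(keys)

# Left merge keeps df's row order; F's costs arrive with an '_F' suffix
merged = df.merge(f_costs, on=keys, how='left', suffixes=('', '_F'))

# True where the row is type E and both costs equal the group's F costs
same_as_f = (
    (merged['TYPE'] == 'E')
    & (merged['COST A'] == merged['COST A_F'])
    & (merged['COST B'] == merged['COST B_F'])
).to_numpy()

# Zero out the costs and relabel the comment on the matching E rows only
df.loc[same_as_f, costs] = '0'
df.loc[same_as_f, 'COMMENT'] = 'Un reported cost'

print(df)
```

With this data, only the SKU_1 E row is changed; the SKU_2 E row keeps its costs and its 'Calculated' comment because its values differ from F. If the costs were numeric, converting them first with `pd.to_numeric` would avoid mismatches such as '0.05' versus '0.050'.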
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
