'Creating a column based on strings from two columns and another boolean column
I'm trying to create a column based on some specific "rules". I'd like to have a new column at the end "Result" with the following result based on the three first columns:
| is_return | From | To | Result |
|---|---|---|---|
| True | Fir | Fem | FirFem |
| False | Tre | Syv | TreSyv |
| True | Syv | Tre | TreSyv_r |
| False | Tre | Syv | TreSyv2 |
| True | Syv | Tre | TreSyv_r2 |
| False | Snø | Van | SnøVan |
Basically if there's a trip that is not a return then just combine from and to, and add a number starting from 2 to it if there are multiple (row 4). If it's tagged as a return trip then first check if it exists as a non-return trip, example given in row 2 and 3. But if it's tagged as a return without there being a non-return variant then keep this format as is (row 1).
Solution 1:[1]
IIUC, you need several steps (commented in the code):
import numpy as np
# compute the string for both directions
s1 = df['From']+df['To']
s2 = df['To']+df['From']
# compute the string is the correct order
# depending on the existence of the first trip
s = pd.Series(np.where(s2.isin(s1[~df['is_return']]), s2+'_r', s1),
index=df.index)
# add number to duplicates
count = s.groupby(s).cumcount().add(1)
df['Result'] = s+np.where(count.gt(1), count.astype(str), '')
output:
is_return From To Result
0 True Fir Fem FirFem
1 False Tre Syv TreSyv
2 True Syv Tre TreSyv_r
3 False Tre Syv TreSyv2
4 True Syv Tre TreSyv_r2
5 False Snø Van SnøVan
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | mozway |
