Pandas: Remove all characters after the last occurrence of a specific character in a dataframe column when that character is repeated several times
Here is a dataframe with a single column, "ParentPath":
import pandas as pd
# Raw strings keep each backslash literal and avoid invalid-escape warnings
data = {'ParentPath': [r'Hi \ All \ First Name \ Last Name \ A \ 200', r'Hi \ All \ First Name \ Middle Name \ Last Name \ B \ 33', r'Hi \ All \ First Name \ C \ 199', r'Hi \ All \ First Name \ D \ 333', r'Hi \ All \ First Name \ E \ 12', r'Hi \ All \ F \ 88']}
df = pd.DataFrame(data)
ParentPath
0 Hi \ All \ First Name \ Last Name \ A \ 200
1 Hi \ All \ First Name \ Middle Name \ Last Name \ B \ 33
2 Hi \ All \ First Name \ C \ 199
3 Hi \ All \ First Name \ D \ 333
4 Hi \ All \ First Name \ E \ 12
5 Hi \ All \ F \ 88
The output needed is shown below: the last " \ " and everything after it are removed. Keep in mind that there is a space before and after each backslash.
ParentPath
0 Hi \ All \ First Name \ Last Name \ A
1 Hi \ All \ First Name \ Middle Name \ Last Name \ B
2 Hi \ All \ First Name \ C
3 Hi \ All \ First Name \ D
4 Hi \ All \ First Name \ E
5 Hi \ All \ F
Solution 1:[1]
Try splitting on the " \ " separator, dropping the last piece, and then joining the rest back together:
df['ParentPath'] = df['ParentPath'].str.split(' \\\\ ').str[:-1].str.join(' \\ ')
Output:
ParentPath
0 Hi \ All \ First Name \ Last Name \ A
1 Hi \ All \ First Name \ Middle Name \ Last Name \ B
2 Hi \ All \ First Name \ C
3 Hi \ All \ First Name \ D
4 Hi \ All \ First Name \ E
5 Hi \ All \ F
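As a possible alternative to the split-and-join in Solution 1, the sketch below splits only once, from the right. It assumes `Series.str.rsplit` treats its separator as a literal string, so `' \\ '` (space, one backslash, space) is enough; the four backslashes in Solution 1 are needed because a multi-character pattern passed to `str.split` is treated as a regular expression by default. The variable name `trimmed` and the shortened sample data are illustrative only.

```python
import pandas as pd

# Two rows taken from the question's sample data (raw strings keep the backslashes literal).
data = {'ParentPath': [r'Hi \ All \ First Name \ Last Name \ A \ 200',
                       r'Hi \ All \ F \ 88']}
df = pd.DataFrame(data)

# n=1 splits only at the last ' \ ', giving [left part, trailing piece].
# .str[0] keeps the left part, i.e. everything before the last separator.
trimmed = df['ParentPath'].str.rsplit(' \\ ', n=1).str[0]

print(trimmed.tolist())
```

One difference worth noting: a row that contains no " \ " at all is left unchanged by this approach, whereas splitting, dropping the last piece, and joining would turn it into an empty string. Which behavior is preferable depends on the data.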
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
