How to remove constant prefix and suffix character [duplicate]
I have a data frame in which numeric data is stored as strings with a prefix that I need to remove. On top of this, each value is wrapped in double quotes inside the string's own quotes, i.e. ' "" '.
import pandas as pd
dict_1 = {"Col1" : [1001, 1002, 1003, 1004, 1005],
          "Col2" : ['"Rs. 5131"', '"Rs. 0"', '"Rs 351157"', '"Rs 535391"', '"Rs. 6513"']}
a = pd.DataFrame(dict_1)
a.head(6)
| | Col1 | Col2 |
|----|----------|-------------|
| 0 |1001 |"Rs. 5131" |
| 1 |1002 |"Rs. 0" |
| 2 |1003 |"Rs 351157" |
| 3 |1004 |"Rs 535391" |
| 4 |1005 |"Rs. 6513" |
As you can see, I want to remove the quotes embedded in Col2 and also remove the Rs prefix.
I tried the following code to slice a single value:
b = a['Col2'][0]
b = b[5:]
b = b[:-1]
b
But the issue is that in some observations the prefix is written as Rs. and in others as Rs without a period.
The result should be a column of integers.
Solution 1:[1]
Given the sample data in the OP, use .replace
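a['Col2'] = a['Col2'].replace({'"': ''}, regex=True)        # drop the embedded quotes
a['Col2'] = a['Col2'].replace({r'Rs\.?': ''}, regex=True)   # drop Rs with or without a period
a['Col2'] = a['Col2'].replace({' ': ''}, regex=True)        # drop the remaining space
a['Col2'] = a['Col2'].astype(int)                           # convert the cleaned strings to integers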
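One detail worth noting when regex=True is used: an unescaped . matches any single character, so the pattern Rs. matches Rs followed by a space as well as Rs followed by a period. The sketch below uses Python's re module on two of the sample values (the variable names are only for illustration) to compare the unescaped pattern with an escaped, optional period.

```python
import re

samples = ["Rs. 5131", "Rs 351157"]

# Unescaped dot: matches "Rs" plus ANY one character (a period or a space here).
unescaped = [re.sub(r"Rs.", "", s) for s in samples]

# Escaped, optional period followed by optional whitespace: matches only the prefix.
escaped = [re.sub(r"Rs\.?\s*", "", s) for s in samples]

print(unescaped)
print(escaped)
```

With the unescaped pattern, whether a leading space is left behind depends on which form of the prefix the value had, so a separate space-removal step is still needed; the escaped pattern removes the prefix and the space together.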
Solution 2:[2]
You can use the removeprefix and removesuffix string methods (available since Python 3.9) once you have the values of the column. Here is a complete answer, as requested in the comments:
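col3 = []
lis = dict_1['Col2']
for b in lis:
    # strip the embedded quotes first, then either form of the prefix
    b = b.removeprefix('"').removesuffix('"').removeprefix("Rs.").removeprefix("Rs ")
    col3.append(int(b))  # int() ignores the leading space left after "Rs."
dict_1['Col2'] = col3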
This way, the prefix is removed whether it is written as Rs. with a period or Rs without one, and no error is raised. Edit: change suggested by @Jhanzaib Humayun. I also found an easier answer that handles the whole Series at once: extract number from string.
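A minimal sketch of the "extract number from string" idea applied to the whole column, assuming the linked answer refers to pandas' Series.str.extract (the link target itself is not shown here, so that is an assumption). The data frame is rebuilt from the question's sample data.

```python
import pandas as pd

a = pd.DataFrame({
    "Col1": [1001, 1002, 1003, 1004, 1005],
    "Col2": ['"Rs. 5131"', '"Rs. 0"', '"Rs 351157"', '"Rs 535391"', '"Rs. 6513"'],
})

# Capture the first run of digits in each string; expand=False returns a Series.
a["Col2"] = a["Col2"].str.extract(r"(\d+)", expand=False).astype(int)
print(a["Col2"].tolist())
```

Because only the digits are captured, the quotes, the prefix, the optional period and the space never need to be removed explicitly.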
Solution 3:[3]
Or use .str.replace():
a["Col2"] = a["Col2"].str.replace(r'Rs\.?\s*', '', regex=True).str.replace('"', '', regex=False)
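Update: use replace with a regular expression. Assign the result back rather than using inplace=True, which returns None and so cannot be chained with .astype:
a["Col2"] = a["Col2"].replace(r'"|Rs\.?\s+', '', regex=True).astype(int)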
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Trenton McKinney |
| Solution 2 | |
| Solution 3 | |