IndexError: string index out of range - if function with dataframe
I am trying to remove the brackets and unneeded marks from a column in my DataFrame.
Here is what my data looks like:
df["address"]
index  address
0      (#)(△)Kaohsiung City
1      (△)New Taipei City
2      (O)Chiayi City
...
Currently I'm using this:
def reshape_address(addr):
    if addr[0] == "(":
        return addr.split(")", 1)[-1]
    else:
        return addr

def run_reshape_addr(text):
    # run three times in order to get rid of multiple brackets
    text["address"] = text["address"].apply(reshape_address)
    text["address"] = text["address"].apply(reshape_address)
    text["address"] = text["address"].apply(reshape_address)

run_reshape_addr(df)
This raises IndexError: string index out of range. However, part of the data had already been processed by the time the error message appeared.
How can I fix this? And why was some of the data still modified even though the run failed? Thank you.
Solution 1:[1]
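You could get rid of all the leading (...) groups in one go with rpartition. On a pandas Series the method is available through the .str accessor and returns three columns (before, separator, after), so select the last one:
def run_reshape_addr(text):
    text["address"] = text["address"].str.rpartition(")")[2]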
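To try the rpartition approach on rows like the ones in the question, here is a minimal sketch. It assumes pandas is installed and that every value in the column is a string. On a Series the method is reached through the .str accessor, which returns a three-column DataFrame. The last two rows are made-up edge cases (no brackets, empty string), not data from the question.

```python
import pandas as pd

# First three rows follow the question; the last two are hypothetical edge cases.
df = pd.DataFrame(
    {
        "address": [
            "(#)(△)Kaohsiung City",
            "(△)New Taipei City",
            "(O)Chiayi City",
            "Tainan City",
            "",
        ]
    }
)

# .str.rpartition(")") yields columns 0, 1, 2 = (before, separator, after).
# Column 2 is the text after the last ")", or the whole string if there is no ")".
cleaned = df["address"].str.rpartition(")")[2]
print(cleaned.tolist())
```

Note that this cuts at the last ")" anywhere in the string, so an address that itself contains a closing bracket further along would lose everything before it.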
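For reference, str.rpartition splits at the last occurrence of the separator. If the separator is absent, the whole string ends up in the last element:
a = "Correct-Horse-Battery-Staple"
b = "Troubadour"
a.rpartition("-") == ("Correct-Horse-Battery", "-", "Staple")  # True
b.rpartition("-") == ("", "", "Troubadour")  # True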
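As for the IndexError itself: addr[0] fails on an empty string. The full data is not shown, so this is an assumption, but a likely cause is a row that is empty or contains only a bracketed mark such as "(O)". Each apply call finishes before its result is assigned back, so if the first pass reduces such a row to "" and the second pass then raises, the column already holds the result of the first pass. That would explain why part of the data was changed. Below is a sketch of a variant that tolerates empty strings and removes only leading groups; strip_leading_marks is a hypothetical name, not something from the question or the answer.

```python
def strip_leading_marks(addr: str) -> str:
    """Remove every leading "(...)" group from addr (hypothetical helper)."""
    # str.startswith returns False for "", whereas addr[0] raises IndexError.
    # The ")" check avoids touching a string with an unclosed "(".
    while addr.startswith("(") and ")" in addr:
        addr = addr.split(")", 1)[1]
    return addr


print(strip_leading_marks("(#)(△)Kaohsiung City"))
print(repr(strip_leading_marks("(O)")))
print(repr(strip_leading_marks("")))
```

With a DataFrame this could be used as df["address"] = df["address"].apply(strip_leading_marks), which needs only one pass. It still assumes every value is a string; missing values (NaN) would need to be handled first.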
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Jack Deeth |
