IndexError: string index out of range - function with an if statement applied to a DataFrame

I'm trying to remove the brackets and unnecessary marks from a DataFrame column.

Here's what my data look like:

df["address"]

index  address
0       (#)(△)Kaohsiung City
1       (△)New Taipei City
2       (O)Chiayi City
.
.

Currently I'm using this:

def reshape_address(addr):
    if addr[0] == "(":
        return addr.split(")", 1)[-1]
    else:
        return addr

def run_reshape_addr(text):
    text["address"] = text["address"].apply(reshape_address)
    text["address"] = text["address"].apply(reshape_address)
    text["address"] = text["address"].apply(reshape_address)
    # run three times in order to get rid of multiple brackets

run_reshape_addr(df)

Somehow I get IndexError: string index out of range. However, part of the data was still processed successfully by this run, even though the error message popped up.

How can I fix this? And why was some of the data still processed in this case? Thank you.



Solution 1:[1]

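You could get rid of all the leading (...) groups in one go with rpartition. On a pandas Series it is reached through the .str accessor, which returns a DataFrame with three columns (before, separator, after), so column 2 holds the text after the last ")":

def run_reshape_addr(text):
    # Keep only the text after the last ")"; rows without ")" are left unchanged.
    # Note: this also cuts at a ")" that appears later in the address.
    text["address"] = text["address"].str.rpartition(")")[2]

For reference, this is how rpartition behaves on plain strings:

a = "Correct-Horse-Battery-Staple"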
b = "Troubadour"

a.rpartition("-") == ("Correct-Horse-Battery", "-", "Staple")
b.rpartition("-") == ("", "", "Troubadour")
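
As for the error itself: indexing addr[0] on an empty string raises IndexError: string index out of range. A likely cause, though the question does not show the offending rows, is an address that is empty or becomes empty once a bracket group is stripped (for example a value that is only "(#)"). Because each apply result is assigned before the next pass starts, a failure in the second or third pass would leave the earlier passes already stored in the DataFrame, which would explain the partially cleaned data. The sketch below is one way to guard against this, using str.startswith instead of indexing; the last two sample values are hypothetical edge cases, not taken from the question.

```python
def reshape_address(addr):
    """Strip every leading "(...)" group from addr; safe on empty strings."""
    # str.startswith returns False for "", whereas addr[0] raises IndexError.
    while addr.startswith("(") and ")" in addr:
        addr = addr.split(")", 1)[1]
    return addr


# Sample values modelled on the question; "(#)" and "" are hypothetical edge cases.
samples = ["(#)(△)Kaohsiung City", "(△)New Taipei City", "(O)Chiayi City", "(#)", ""]
cleaned = [reshape_address(s) for s in samples]
print(cleaned)
```

With the DataFrame from the question, a single df["address"] = df["address"].apply(reshape_address) should then replace the three repeated passes, assuming every value in the column is a string (missing values such as NaN would need separate handling).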

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Jack Deeth