'How to use delimiter on column where some instances do not have the delimiter, python

I have a column called sub, that I want to split into sub_1 and sub_2.

sub example looks like this:

banana/apple, 
banana,
apple

where some records have the delimiter '/' and some don't.

What I am trying to do is: if there is a delimiter like in the first example above, split so: sub_1 -> banana and sub_2 -> apple

if there is no delimiter like in the second two examples, then it would look like: sub_1 -> banana sub_1 -> apple

I tried this code:

df[['sub_1', 'sub_2']] = df['sub'].str.split('/', expand=True)

However I get this error:

ValueError: Columns must be same length as key

which I am guessing is because how some columns do not have a delimiter, wondering if there is a quick way to default that issue to the first column if anyone here has run into this issue before.

Thanks for any direction.



Solution 1:[1]

Assuming your dataframe looks like this:

>>> df
            sub
0  banana/apple
1        banana
2         apple

Then you just need to split with expand:

>>> df[["sub_1", "sub_2"]] = df["sub"].str.split("/", expand=True)

>>> df
            sub   sub_1  sub_2
0  banana/apple  banana  apple
1        banana  banana   None
2         apple   apple   None

Or, using pd.Series, if you want NaNs:

>>> df["sub"].str.split("/").apply(pd.Series)
            sub   sub_1  sub_2
0  banana/apple  banana  apple
1        banana  banana    NaN
2         apple   apple    NaN

Solution 2:[2]

Try this:

df = pd.DataFrame({"sub": ["banana/apple", "banana", "apple"]})

res = df.join(df.pop("sub").str.split("/", expand=True))
res.columns = ["sub_1", "sub_2"]

print(res)

    sub_1  sub_2
0  banana  apple
1  banana   None
2   apple   None

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 ddejohn
Solution 2 Rabinzel