How to concatenate two columns ignoring NaN?
I have a dataframe:
id1 id2
a NaN
b c
d e
I want to create a new column ids as a concatenation of id1 and id2:
df["ids"] = df.id1 + "-" + df.id2
But result is:
id1 id2 ids
a NaN NaN
b c b-c
d e d-e
As you can see, the concatenation of a and NaN is NaN; however, I would like to get a-NaN. How can I do that? Desired result:
id1 id2 ids
a NaN a-NaN
b c b-c
d e d-e
Solution 1:[1]
import numpy as np
import pandas as pd
df = pd.DataFrame({"id1":['a', 'b', 'c'], "id2":[np.nan, 'c', 'e']})
df['ids'] = df.id1.replace(np.nan, 'NaN') + "-" + df.id2.replace(np.nan, 'NaN')
print(df)
> id1 id2 ids
> 0 a NaN a-NaN
> 1 b c b-c
> 2 c e c-e
I think you can replace the np.nan values before the concatenation.
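To see why the replacement has to come first, here is a minimal sketch using the question's data. It uses fillna rather than replace, which is a swapped-in but equivalent way to substitute the missing values; the variable names naive and filled are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"id1": ["a", "b", "d"], "id2": [np.nan, "c", "e"]})

# Plain concatenation: any row with a missing operand comes out missing.
naive = df["id1"] + "-" + df["id2"]
print(naive.isna().tolist())

# Filling the gaps first keeps every row. fillna returns a new Series,
# so df["id2"] itself still holds the original NaN.
filled = df["id1"].fillna("NaN") + "-" + df["id2"].fillna("NaN")
print(filled.tolist())
```

Only the first row is affected by the plain concatenation, and the source columns are left unchanged because nothing is assigned back to them.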
Solution 2:[2]
import numpy as np
import pandas as pd
df = pd.DataFrame({'id1': ['a', 'b', 'c'], 'id2': [np.nan, 'c', 'e']})
df['id2'] = df['id2'].astype(str)
df['ids'] = df['id1'] + "-" + df['id2']
df
Result:
id1 id2 ids
0 a nan a-nan
1 b c b-c
2 c e c-e
Alternatively, use:
df['id2'] = df['id2'].fillna('NaN')
instead of the type conversion. This way, you get to pick the exact string (including capitalization) you want for the NaNs.
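If the goal is only to choose the placeholder string, it may also be possible to do so at concatenation time without modifying id2 at all. The sketch below uses Series.str.cat with its na_rep parameter; this is not one of the answers above, just an alternative under the assumption that both columns hold strings or NaN.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"id1": ["a", "b", "d"], "id2": [np.nan, "c", "e"]})

# sep is placed between the two values; na_rep is the string used
# in place of any missing value on either side.
df["ids"] = df["id1"].str.cat(df["id2"], sep="-", na_rep="NaN")

print(df["ids"].tolist())
```

Because the placeholder is applied only inside the concatenation, id2 keeps its real NaN, which can matter if later code relies on isna() checks.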
Solution 3:[3]
This will work:
df['ids'] = df['id1'].astype(str) + "-" + df['id2'].astype(str)
Output:
id1 id2 ids
0 a NaN a-nan
1 b c b-c
2 d e d-e
Another way would be to first convert NaN to a string by using .fillna() and assigning the result back:
df['id2'] = df['id2'].fillna('NaN')
df['ids'] = df['id1'] + "-" + df['id2']
Output:
id1 id2 ids
0 a NaN a-NaN
1 b c b-c
2 d e d-e
Solution 4:[4]
Given a number a, I would say that you cannot perform a-NaN, as NaN stands for "Not a Number".
If you just want to skip NaN values as if they were 0s, you'll need to filter them out (as done in this other case).
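For string columns, "skipping" the NaN values could mean leaving them out of the joined result entirely, so that the first row becomes a rather than a-NaN. This is a different outcome from the one the question asks for; the sketch below is one possible way to do it, joining only the non-missing values of each row.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"id1": ["a", "b", "d"], "id2": [np.nan, "c", "e"]})

# For each row, drop the missing entries and join what is left with "-".
# A row with a single non-missing value yields that value with no separator.
df["ids"] = df[["id1", "id2"]].apply(lambda row: "-".join(row.dropna()), axis=1)

print(df["ids"].tolist())
```

Row-wise apply is slower than the vectorized approaches in the other solutions, so this is best suited to small frames.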
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | PinkR1ver |
| Solution 2 | 9769953 |
| Solution 3 | |
| Solution 4 | |
