How can I fill empty ID cells with the IDs from either one of two other DataFrames?
I have the following table in which some (but not all) user IDs are missing:
| user ID | item ID | item type |
|---|---|---|
| 10 | 123 | question |
| NaN | 126 | answer |
| 14 | 129 | question |
To get the missing user IDs I want to look up the corresponding item ID in the following two tables (depending on whether the item type is a question or an answer in the table above).
answer DataFrame:
| item ID | user ID |
|---|---|
| 126 | 12 |
question DataFrame:
| item ID | user ID |
|---|---|
| 123 | 10 |
| 129 | 14 |
Finally, I want to get something like this:
| user ID | item ID | item type |
|---|---|---|
| 10 | 123 | question |
| 12 | 126 | answer |
| 14 | 129 | question |
Solution 1:[1]
Try with numpy.select:
import numpy as np
# Rows with a missing user ID, split by item type
conditions = [df["user ID"].isnull() & df["item type"].eq("question"),
              df["user ID"].isnull() & df["item type"].eq("answer")]
# Look up each item ID in the table that matches its item type
choices = [df["item ID"].map(dict(zip(question["item ID"], question["user ID"]))),
           df["item ID"].map(dict(zip(answer["item ID"], answer["user ID"])))]
# Keep the existing user ID where neither condition applies
df["user ID"] = np.select(conditions, choices, df["user ID"])
>>> df
   user ID  item ID item type
0     10.0      123  question
1     12.0      126    answer
2     14.0      129  question
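For a self-contained check, here is a minimal sketch that rebuilds the three tables from the question and applies the same numpy.select idea. The variable names `question_map` and `answer_map` are illustrative, and the final cast to `int` assumes every missing user ID is found in one of the lookup tables.

```python
import numpy as np
import pandas as pd

# Sample data copied from the tables in the question
df = pd.DataFrame({
    "user ID": [10, np.nan, 14],
    "item ID": [123, 126, 129],
    "item type": ["question", "answer", "question"],
})
question = pd.DataFrame({"item ID": [123, 129], "user ID": [10, 14]})
answer = pd.DataFrame({"item ID": [126], "user ID": [12]})

# item ID -> user ID lookups, one per table (illustrative names)
question_map = question.set_index("item ID")["user ID"]
answer_map = answer.set_index("item ID")["user ID"]

missing = df["user ID"].isnull()
conditions = [missing & df["item type"].eq("question"),
              missing & df["item type"].eq("answer")]
choices = [df["item ID"].map(question_map),
           df["item ID"].map(answer_map)]

# Rows that match neither condition keep their existing user ID
df["user ID"] = np.select(conditions, choices, default=df["user ID"])

# Assumption: no NaN is left, so the column can be cast back to int
df["user ID"] = df["user ID"].astype(int)
print(df)
```

If some IDs might still be missing after the lookup, skipping the final cast (or using the nullable `"Int64"` dtype) is probably the safer choice.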
Solution 2:[2]
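You can use np.where() and a merge to get the data you need. Note that this example only looks up IDs in the answer table (answer_df):
import numpy as np
import pandas as pd
# Use 0 as a placeholder for missing user IDs
df['user ID'] = df['user ID'].fillna(0).astype(int)
# A left merge keeps exactly the rows of df, in their original order
df_final = pd.merge(left=df, right=answer_df, on='item ID', how='left', suffixes=('', '_right'))
df_final['user ID'] = np.where(df_final['user ID'] == 0, df_final['user ID_right'], df_final['user ID']).astype(int)
df_final[['user ID', 'item ID', 'item type']]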
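The merge idea can also be extended to cover both lookup tables in one pass. The sketch below is one possible variant, not the answer's original code: it labels each lookup table with its item type, stacks them, and merges on both `item ID` and `item type`. The names `question_df`, `lookup` and the `_lookup` suffix are assumptions for illustration, and it assumes each (item ID, item type) pair appears at most once in the lookup tables.

```python
import numpy as np
import pandas as pd

# Sample data copied from the tables in the question
df = pd.DataFrame({
    "user ID": [10, np.nan, 14],
    "item ID": [123, 126, 129],
    "item type": ["question", "answer", "question"],
})
question_df = pd.DataFrame({"item ID": [123, 129], "user ID": [10, 14]})
answer_df = pd.DataFrame({"item ID": [126], "user ID": [12]})

# Tag each lookup table with its item type, then stack them into one table
lookup = pd.concat(
    [
        question_df.assign(**{"item type": "question"}),
        answer_df.assign(**{"item type": "answer"}),
    ],
    ignore_index=True,
)

# The left merge keeps every row of df; the looked-up ID lands in "user ID_lookup"
merged = df.merge(
    lookup, on=["item ID", "item type"], how="left", suffixes=("", "_lookup")
)

# Fill only the missing IDs; assumes every gap has a match, so int is safe
merged["user ID"] = merged["user ID"].fillna(merged["user ID_lookup"]).astype(int)
df_final = merged[["user ID", "item ID", "item type"]]
print(df_final)
```

Merging on both columns avoids the 0 placeholder and keeps a question from picking up an answer's user ID if the two tables ever share an item ID.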
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | not_speshal |
| Solution 2 | ArchAngelPwn |
