Combining DataFrames and filling 0s for missing data
I'm trying to merge many DataFrames, one per date. If a user doesn't appear in a given date's DataFrame, I want to keep that user's identifying columns (e.g. user name) and set the numeric columns to 0 for that date.
import pandas as pd

df1 = pd.DataFrame({'user': ['A', 'B'],
                    'dt': ['2016-01-01', '2016-01-01'],
                    'userID': ['xxxa', 'yyyb'],
                    'val': [11, 22],
                    'val2': [111, 222]})
df2 = pd.DataFrame({'user': ['A', 'A', 'C'],
                    'dt': ['2016-02-13', '2016-02-13', '2016-02-13'],
                    'userID': ['xxxa', 'kkka', 'jjjc'],
                    'val': [33, 44, 55],
                    'val2': [333, 444, 555]})
DataFrame 1 on a certain date:
   dt          user  userID  val  val2  val3...
0  2016-01-01  A     xxxa    11   ...
1  2016-01-01  B     yyyb    22   ...
DataFrame 2 on another date:
   dt          user  userID  val  val2  val3...
0  2016-02-13  A     xxxa    33   ...
1  2016-02-13  A     kkka    44   ...
2  2016-02-13  C     jjjc    55   ...
Desired merged result:
   dt          user  userID  val  val2  val3...
0  2016-01-01  A     xxxa    11   ...
1  2016-02-13  A     xxxa    33   ...
2  2016-01-01  A     kkka    0    ...
3  2016-02-13  A     kkka    44   ...
4  2016-01-01  B     yyyb    22   ...
5  2016-02-13  B     yyyb    0    ...
6  2016-01-01  C     jjjc    0    ...
7  2016-02-13  C     jjjc    55   ...
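One possible way to get this shape, offered as a sketch rather than a definitive answer: stack the per-date frames, build the full grid of every (user, userID) pair against every date, left-merge the real rows onto that grid, and fill the gaps with 0. The names `combined`, `users`, `dates`, `grid`, `num_cols`, and `result` are illustrative only. The sketch assumes each (user, userID, dt) combination appears at most once, that `val` and `val2` are the only numeric columns to fill, and that pandas 1.2 or later is available for `how='cross'`.

```python
import pandas as pd

df1 = pd.DataFrame({'user': ['A', 'B'],
                    'dt': ['2016-01-01', '2016-01-01'],
                    'userID': ['xxxa', 'yyyb'],
                    'val': [11, 22],
                    'val2': [111, 222]})
df2 = pd.DataFrame({'user': ['A', 'A', 'C'],
                    'dt': ['2016-02-13', '2016-02-13', '2016-02-13'],
                    'userID': ['xxxa', 'kkka', 'jjjc'],
                    'val': [33, 44, 55],
                    'val2': [333, 444, 555]})

# Stack all per-date frames into one long frame.
combined = pd.concat([df1, df2], ignore_index=True)

# Distinct (user, userID) pairs, grouped by user with a stable sort so that
# pairs for the same user keep their first-seen order.
users = (combined[['user', 'userID']]
         .drop_duplicates()
         .sort_values('user', kind='mergesort'))

# Distinct dates, in first-seen order.
dates = combined[['dt']].drop_duplicates()

# Cross join: one row per (user, userID) pair per date.
grid = users.merge(dates, how='cross')

# Attach the real values; combinations with no data come back as NaN.
result = grid.merge(combined, on=['user', 'userID', 'dt'], how='left')

# Assumption: these are the numeric columns that should default to 0.
num_cols = ['val', 'val2']
result[num_cols] = result[num_cols].fillna(0).astype(int)

result = result[['dt', 'user', 'userID', 'val', 'val2']]
print(result)
```

With more than two frames, the same steps should apply by passing the whole list to `pd.concat`. If a (user, userID, dt) combination can occur more than once, the left merge will repeat that grid row, so the data would need aggregating first.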
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
