'Concatenate Dataframes on common values yields NaN values for non-matches [python]

I am trying to merge/concatenate two dataframes on a common column and match up all the corresponding values. However, while matching values receive corresponding values for that row, when there is no match, a NaN value is produced. I am using python for this. I will explain here in more detail.

I have this dataframe A:

      ID     Area    Distance     Height       Temp
----------------------------------------------------
0    100     8.31           0       1.30      24.27
1    101     3.11           0       1.29      25.99
2    102     5.10           0       1.23      29.51
3    105     9.70           0       1.97      15.17
4    107     4.77           0       1.53      27.84
...

Each ID represents a different building footprint (polygon), with its Area Recorded, the height of the building, and the mean outdoor temperature recorded at the site of the building. The "Distance" column denotes the distance away from the building at which the temperature was recorded, and so onsite = 0 meters away.

And I have this dataframe B:

        ID      Temp    Distance
---------------------------------
 0     100     25.68           5
 1     100     26.05          10
 2     100     26.85          15
 3     100     27.25          20
 4     100     27.78          25
 5     101     22.68           5
 6     101     26.44          10
 7     101     26.83          15
 8     101     27.26          20
 9     101     28.38          25
10     102     25.63           5
11     102     26.26          10
12     102     26.57          15
13     102     26.91          20
14     102     28.84          25
15     105     25.33           5
16     105     26.25          10
17     105     26.54          15
18     105     26.23          20
19     105     27.53          25
20     107     25.23           5
21     107     26.73          10
22     107     26.26          15
23     107     26.11          20
24     107     27.16          25
...

This shows for the same building IDs the temperatures recorded at different distances away from the building, and so for each building I want the recorded mean outdoor temperature 5 meters away, 10 meters away, 15 meters away, 20 meters away, and 25 meters away.

What I want to do is join dataframes A and B by the common "ID" column. And so what I want to do is produce a dataframe C that shows for each ID that buildings Temperature at Distances 0, 5, 10, 15, 20, and 25. The issue is that I want for each building ID the Area and Height to remain the same, because of course the buildings area and height will not change! And so I want to produce the following dataframe C:

       ID     Area    Distance     Height       Temp
----------------------------------------------------
 0    100     8.31           0       1.30      24.27
 1    100     8.31           5       1.30      25.68
 2    100     8.31          10       1.30      26.05
 3    100     8.31          15       1.30      26.85    
 4    100     8.31          20       1.30      27.25    
 5    100     8.31          25       1.30      27.78
 6    101     3.11           0       1.29      25.99
 7    101     3.11           5       1.29      22.68    
 8    101     3.11          10       1.29      26.44    
 9    101     3.11          15       1.29      26.83    
10    101     3.11          20       1.29      27.26    
11    101     3.11          25       1.29      28.38    
12    102     5.10           0       1.23      29.51
13    102     5.10           5       1.23      25.63    
14    102     5.10          10       1.23      26.26    
15    102     5.10          15       1.23      26.57    
16    102     5.10          20       1.23      26.91     
17    102     5.10          25       1.23      28.84
18    105     9.70           0       1.97      15.17
19    105     9.70           5       1.97      25.33
20    105     9.70          10       1.97      26.25
21    105     9.70          15       1.97      26.54
22    105     9.70          20       1.97      26.23
23    105     9.70          25       1.97      27.53
24    107     4.77           0       1.53      27.84
25    107     4.77           5       1.53      25.23
26    107     4.77          10       1.53      26.73
27    107     4.77          15       1.53      26.26
28    107     4.77          20       1.53      26.11
29    107     4.77          25       1.53      27.16
...

And so to obtain this I try the following, trying to concatenate dataframes A and B on the "ID" column, and then sorting the rows by "ID" and "Distance":

df_C =  pd.concat([df_A, df_B]).sort_values(["ID", "Distance"]).reset_index(drop=True)

However this yields:

       ID     Area    Distance     Height       Temp
----------------------------------------------------
 0    100     8.31           0       1.30      24.27
 1    100      NaN           5        NaN      25.68
 2    100      NaN          10        NaN      26.05
 3    100      NaN          15        NaN      26.85    
 4    100      NaN          20        NaN      27.25    
 5    100      NaN          25        NaN      27.78
 6    101     3.11           0       1.29      25.99
 7    101      NaN           5        NaN      22.68    
 8    101      NaN          10        NaN      26.44    
 9    101      NaN          15        NaN      26.83    
10    101      NaN          20        NaN      27.26    
11    101      NaN          25        NaN      28.38    
12    102     5.10           0       1.23      29.51
13    102      NaN           5        NaN      25.63    
14    102      NaN          10        NaN      26.26    
15    102      NaN          15        NaN      26.57    
16    102      NaN          20        NaN      26.91     
17    102      NaN          25        NaN      28.84
18    105     9.70           0        1.97     15.17
19    105      NaN           5        NaN      25.33
20    105      NaN          10        NaN      26.25
21    105      NaN          15        NaN      26.54
22    105      NaN          20        NaN      26.23
23    105      NaN          25        NaN      27.53
24    107     4.77           0       1.53      27.84
25    107      NaN           5        NaN      25.23
26    107      NaN          10        NaN      26.73
27    107      NaN          15        NaN      26.26
28    107      NaN          20        NaN      26.11
29    107      NaN          25        NaN      27.16
...

And so it appears that the Area and Height values are not getting matched up because Dataframe B does not contain the corresponding Area and Height Values, and so there is nothing to report there when I merge the two dataframes. How can I fix this issue so that I get my intended Dataframe C?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source