Concatenate data frames over finite index otherwise start a new column - pandas

I need to add new data to the last column of a data frame wherever that column has empty (NaN) cells, and put any value that does not fit into a new column. Is there a pythonic way to achieve this with pandas functionality (e.g. concat, join, merge)? The example is as follows:

import numpy as np
import pandas as pd

df1 = pd.DataFrame({'0':[8, 9, 3, 5, 0], '1':[9, 6, 6, np.nan, np.nan]})

df2 = pd.DataFrame({'2':[2, 9, 4]}, index = [3,4,0])

desired_output = pd.DataFrame({'0':[8, 9, 3, 5, 0],
                               '1':[9, 6, 6, 2, 9],
                               '2':[4, np.nan, np.nan, np.nan, np.nan]})
# df1
   0  1
0  8  9
1  9  6
2  3  6
3  5  NaN
4  0  NaN

# df2
   2
3  2
4  9
0  4

# desired_output
   0  1  2
0  8  9  4
1  9  6  NaN
2  3  6  NaN
3  5  2  NaN
4  0  9  NaN


Solution 1:[1]

Your problem can be broken down into 2 steps:

  1. Concatenate df1 and df2 column-wise, aligned on their indexes.
  2. For each row of the concatenated dataframe, move the NaN values to the end.

Try this:

# Step 1: concatenate the two dataframes
result = pd.concat([df1, df2], axis=1)

# Step 2a: for each row, sort the elements based on their nan status
# For example: sort [1, 2, nan, 3] based on [False, False, True, False]
#              np.argsort will return [0, 1, 3, 2]
# Stable sort is critical here since we don't want to swap elements whose
# sort keys are equal.
arr = result.to_numpy()
idx = np.argsort(np.isnan(arr), kind="stable")

# Step 2b: reconstruct the result dataframe based on the sort order
result = pd.DataFrame(
    np.take_along_axis(arr, idx, axis=1),
    index=result.index,  # keep the original row labels
    columns=result.columns,
)
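For reuse, the two steps can be wrapped in a small helper and run against the data from the question. This is only a sketch: the function name shift_nans_right is made up for illustration, it assumes every column is numeric (so the frame converts cleanly to a float array), and passing index=frame.index is an addition that keeps the row labels when the index is not the default 0..n-1.

```python
import numpy as np
import pandas as pd


def shift_nans_right(frame: pd.DataFrame) -> pd.DataFrame:
    """Move the NaNs in each row to the end, keeping other values in order.

    Hypothetical helper; assumes all columns are numeric.
    """
    arr = frame.to_numpy(dtype=float)
    # Sort each row by its NaN mask: False (a value) sorts before True (NaN).
    # kind="stable" keeps values with equal keys in their original order.
    order = np.argsort(np.isnan(arr), axis=1, kind="stable")
    shifted = np.take_along_axis(arr, order, axis=1)
    return pd.DataFrame(shifted, index=frame.index, columns=frame.columns)


# The sort-key example from the comments in the answer.
row = np.array([1, 2, np.nan, 3])
print(np.argsort(np.isnan(row), kind="stable"))  # [0 1 3 2]

# Data from the question.
df1 = pd.DataFrame({'0': [8, 9, 3, 5, 0], '1': [9, 6, 6, np.nan, np.nan]})
df2 = pd.DataFrame({'2': [2, 9, 4]}, index=[3, 4, 0])

result = shift_nans_right(pd.concat([df1, df2], axis=1))
print(result)
```

Note that the column labels stay in place while the values move, so after the shift a value that came from df2 may sit under column '1'. The result is also all floats, because NaN cannot be stored in an integer column.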

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Code Different