'How to fill a df column with range of values of 2 columns from another df?

Df1

A   B  C1  C2   D   E
a1  b1  2   4   d1  e1
a2  b2  1   2   d2  e2

Df2

A   B   C   D   E
a1  b1  2   d1  e1
a1  b1  3   d1  e1
a1  b1  4   d1  e1
a2  b2  1   d2  e2
a2  b2  2   d2  e2

How to make Df2 from Df1 in the fastest possible way? I tried using groupby and then within for loop used np.arange to fill Df2.C and then used pd.concat to make the final Df2. But this approach is very slow and doesn't seem very elegant and pythonic as well. Can somebody please help with this problem.



Solution 1:[1]

Try this:

df1.assign(C = [np.arange(s, e+1) for s, e in zip(df1['C1'], df1['C2'])])\
   .explode('C')

Output:

    A   B  C1  C2   D   E  C
0  a1  b1   2   4  d1  e1  2
0  a1  b1   2   4  d1  e1  3
0  a1  b1   2   4  d1  e1  4
1  a2  b2   1   2  d2  e2  1
1  a2  b2   1   2  d2  e2  2

Solution 2:[2]

One way is to melt df1, use groupby.apply to add ranges; then explode for the final output:

cols = ['A','B','D','E']
out = (df1.melt(cols, value_name='C').groupby(cols)['C']
       .apply(lambda x: range(x.min(), x.max()+1))
       .explode().reset_index(name='C'))

Output:

    A   B   D   E  C
0  a1  b1  d1  e1  2
1  a1  b1  d1  e1  3
2  a1  b1  d1  e1  4
3  a2  b2  d2  e2  1
4  a2  b2  d2  e2  2

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Scott Boston
Solution 2