'How to handle duplicate column names when AsTable is used to separate a NamedTuple column

I have a DataFrame with 3 columns, where the last column nt is a NamedTuple type:

julia> df = DataFrame(:a => [1,2], :b => [3,4], :nt => [(a=10,b=20),(a=30,b=40)])
2×3 DataFrame
 Row │ a      b      nt
     │ Int64  Int64  NamedTup…
─────┼────────────────────────────────
   1 │     1      3  (a = 10, b = 20)
   2 │     2      4  (a = 30, b = 40)

I want to separate the NamedTuple fields into their respective columns. So I use this:

julia> transform(df, :nt => AsTable)
2×3 DataFrame
 Row │ a      b      nt
     │ Int64  Int64  NamedTup…
─────┼────────────────────────────────
   1 │    10     20  (a = 10, b = 20)
   2 │    30     40  (a = 30, b = 40)

Problem: all the initial values of the first and second columns, a and b were overwritten by the NamedTuple values because of column name duplications.

Is there an easy and quick manner to prevent values of existing columns to be overwritten, so that I can get something such as this:

2×3 DataFrame
 Row │ a      b      nt_a   nt_b
     │ Int64  Int64  Int64  Int64
─────┼────────────────────────────────
   1 │     1      3     10     20
   2 │     2      4     30     40

Any pointers are much appreciated.



Solution 1:[1]

Try this:

julia> transform(df, :a => :a1, :b => :b1, :nt => AsTable)
2×5 DataFrame
 Row ? a      b      nt                a1     b1
     ? Int64  Int64  NamedTup…         Int64  Int64
????????????????????????????????????????????????????
   1 ?    10     20  (a = 10, b = 20)      1      3
   2 ?    30     40  (a = 30, b = 40)      2      4

If you really need the resulting frame to be just exactly as your hypothetical example, I think you need to do a transform on the transform, since I don't think the named tuple names can be changed as they are brought in as a table:

julia> transform(transform(df, :a => :nt_a, :b => :nt_b, :nt =>
 AsTable)[!, [:a, :b, :nt_a, :nt_b]], :a => :nt_a, :b => :nt_b, 
:nt_a => :a, :nt_b => :b)

2×4 DataFrame
 Row ? a      b      nt_a   nt_b
     ? Int64  Int64  Int64  Int64
??????????????????????????????????
   1 ?     1      3     10     20
   2 ?     2      4     30     40

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1