'Using Julia to expand a list of dicts fields in dataframe
I have a column in a dataframe that is a list of dictionaries:
[{"key1": value1, "key2": "value2", "key3": "value3", "key4": "value4"}, {"key1": value1, "key2": "value2", "key3": "value3", "key4": "value4"}]
Is there a way to expand this column to get something like this:
key1 key2 key3 key4
value1 value2 value3 value4
value1 value2 value3 value4
Note: key_ can be any string, value_ can be any value.
Solution 1:[1]
This is quite easy:
julia> df = DataFrame(col=[Dict("key1"=>"value1", "key2"=>"value2", "key3"=>"value3", "key4"=>"value4"),
Dict("key1"=>"value1", "key2"=>"value2", "key3"=>"value3", "key4"=>"value4")])
2×1 DataFrame
Row ? col
? Dict…
?????????????????????????????????????????
1 ? Dict("key2"=>"value2", "key3"=>"…
2 ? Dict("key2"=>"value2", "key3"=>"…
julia> select(df, :col => AsTable)
2×4 DataFrame
Row ? key2 key3 key1 key4
? String String String String
??????????????????????????????????????
1 ? value2 value3 value1 value4
2 ? value2 value3 value1 value4
The only limitation is that the order of resulting columns is undefined as Dict does not guarantee key order. You would need to re-order them in the second step e.g. like this:
julia> select(select(df, :col => AsTable), string.("key", 1:4))
2×4 DataFrame
Row ? key1 key2 key3 key4
? String String String String
??????????????????????????????????????
1 ? value1 value2 value3 value4
2 ? value1 value2 value3 value4
Another approach would be:
julia> select(df, :col .=> [ByRow(x -> x["key$i"]) => "key$i" for i in 1:4])
2×4 DataFrame
Row ? key1 key2 key3 key4
? String String String String
??????????????????????????????????????
1 ? value1 value2 value3 value4
2 ? value1 value2 value3 value4
it is a bit more complex conceptually and requires you to know what keys you want to extract, but the benefit is that you do the operation in one shot.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Bogumił Kamiński |
