Sum of list values in a df, new column, values are objects
I have a df built from the values of a dictionary. I can strip the [] and ',' characters and split everything into separate cols (one col per number), but I can't convert the values to float, although I tried several ways. I'm looking for new cols that hold the sum of the values in each []. Accessing df1['all'][0] returns the numbers exactly as they appear in column 'all'.
import pandas as pd
data = {'all':['[0.75, 0.34, 0.91, 0.12, 0.5],[0.54, 0.65, 0.5, 0.79, 0.91],[0.81, 0.77, 0.82, 0.66, 0.38],[0.0, 0.78, 0.87, 0.81, 0.67],[0.0, 0.0, 0.56, 0.44, 0.0]']
}
df1 = pd.DataFrame(data)
print(df1)
Preferred output
all all1 all2 all3 all4 all5
0 [0.75, 0.34..... 2.62 3.39 3.44 3.13 1.0
Solution 1:[1]
Try:
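from ast import literal_eval
# literal_eval parses each string in "all" into a tuple of lists;
# each inner list is then summed into its own column (all1, all2, ...).
df_out = pd.DataFrame(
    [
        {"all": orig} | {f"all{i}": sum(l) for i, l in enumerate(row, 1)}
        for orig, row in zip(df1["all"], df1["all"].apply(literal_eval))
    ]
)
print(df_out)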
Prints:
all all1 all2 all3 all4 all5
0 [0.75, 0.34, 0.91, 0.12, 0.5],[0.54, 0.65, 0.5, 0.79, 0.91],[0.81, 0.77, 0.82, 0.66, 0.38],[0.0, 0.78, 0.87, 0.81, 0.67],[0.0, 0.0, 0.56, 0.44, 0.0] 2.62 3.39 3.44 3.13 1.0
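To see why the conversion to float is no longer needed, it may help to look at what literal_eval returns for one cell. The following is a minimal sketch using a shortened version of the string from the question; the names raw, parsed and sums are illustrative and do not appear in the answer.

```python
from ast import literal_eval

# Shortened copy of one cell from the "all" column (first two lists only).
raw = "[0.75, 0.34, 0.91, 0.12, 0.5],[0.54, 0.65, 0.5, 0.79, 0.91]"

# Comma-separated lists at the top level parse as a tuple of lists,
# and the numbers inside are already Python floats.
parsed = literal_eval(raw)

# Sum each inner list; rounding only hides floating-point noise.
sums = [round(sum(inner), 2) for inner in parsed]

print(type(parsed).__name__, len(parsed))
print(sums)
```

Because the parsed values are floats rather than strings, sum() can be applied to each inner list directly.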
Solution 2:[2]
This is a modification of @Andrej Kesely's answer. It also works on Python versions older than 3.9, because it does not use the dict union operator |, and it keeps any other columns that df1 contains.
from ast import literal_eval
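# Build the sum columns separately and attach them to df1 side by side.
# axis=1 aligns on the index, so this assumes df1 has the default RangeIndex.
df_out = pd.concat(
    [
        df1,
        pd.DataFrame(
            [
                {f"all{i}": sum(l) for i, l in enumerate(row, 1)}
                for row in df1["all"].apply(literal_eval)
            ]
        )
    ],
    axis=1
)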
print(df_out)
Prints (here for a df1 that also has columns col1 and col2 and two rows):
col1 col2 all all1 all2 all3 all4 all5
0 A C [0.75, 0.34, 0.91, 0.12, 0.5],[0.54, 0.65, 0.5... 2.62 3.39 3.44 3.13 1.0
1 B D [0.75, 0.34, 0.91, 0.12, 0.5],[0.54, 0.65, 0.5... 2.62 3.39 3.44 3.13 1.0
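The Python version remark comes down to how two dicts are merged. As a small sketch with made-up values (the names orig, sums and merged are illustrative only), dict unpacking gives the same result as the | operator and is available from Python 3.5 onwards:

```python
# Illustrative values only; these names are not taken from the answers.
orig = {"all": "[0.75, 0.34],[0.54, 0.65]"}
sums = {"all1": 1.09, "all2": 1.19}

# Works on Python 3.5+: unpack both dicts into a new one.
# On Python 3.9+ the expression `orig | sums` builds the same dict.
merged = {**orig, **sums}

print(merged)
```

Keys keep their insertion order, so "all" stays first and the sum columns follow it.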
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Andrej Kesely |
| Solution 2 |
