'How do I extract multiple values from each row of a DataFrame in Python?
I have a DataFrame df2 with columns currencies and c_codes.
Each row in the currencies column is a list of one or more dictionaries. I want to extract the value for the key code in each dictionary within each list in each row of currencies and transfer the code values to a different column of the DataFrame c_codes
Case in point:

From df2['currencies'][0] I want to extract the code value AFN and transfer it to df2['c_codes'][0]
Similarly, if there are multiple code values for a row, such as df2['currencies'][8] then I want to extract a list of code values ['AUD','GBP'] and transfer the list to df2['c_codes][8]
Each entry in c_codes can be a list for this purpose.
Here's my code:

I have tried using a loop to grab the code values in each dict. in each row and append them to a list temp. Then append the list temp to a bigger list list_of_currencies so I get a list with lists of codes corresponding to each row. Then I clear the temp list so it can grab the next row of code and so on.
However, the code returns the list_of_currencies as empty. I have tried playing around with the looping, temp, lists, etc. but it just returns empty list or else a list of all appended codes without sub-lists.
I want a list returned with sub-lists of codes so I can assign each sub-list to a corresponding row in c_codes column.
What am I doing wrong? Is there a simpler way to do this?
Solution 1:[1]
I think you can use explode to expand the list of objects, and .str to get the code value, for a vectorized (read: really fast) solution:
df['c_codes'] = df['currencies'].explode().str['code'].groupby(level=0).agg(list).str.join(', ')
Output (based on sample data at the bottom of this answer):
>>> df
currencies c_codes
0 [{'code': 'AFN', 'name': 'Afghan afghani'}] AFN
1 [{'code': 'EUR', 'name': 'Euro'}] EUR
2 [{'code': 'AUD', 'name': 'Australian dollar'}, {'code': 'GBP', 'name': 'British pound'}] AUD, GBP
3 [{'code': 'XCD', 'name': 'East Carribean dollar'}] XCD
If you want actual lists in the c_codes column and not just comma-separate strings, just omit the .str.join part:
df['c_codes'] = df['currencies'].explode().str['code'].groupby(level=0).agg(list)
Output:
>>> df
currencies c_codes
0 [{'code': 'AFN', 'name': 'Afghan afghani'}] [AFN]
1 [{'code': 'EUR', 'name': 'Euro'}] [EUR]
2 [{'code': 'AUD', 'name': 'Australian dollar'}, {'code': 'GBP', 'name': 'British pound'}] [AUD, GBP]
3 [{'code': 'XCD', 'name': 'East Carribean dollar'}] [XCD]
Sample data for testing above code:
df = pd.DataFrame({'currencies': [
[{'code':'AFN','name':'Afghan afghani'}],
[{'code':'EUR','name':'Euro'}],
[{'code':'AUD','name':'Australian dollar'},{'code':'GBP','name':'British pound'}],
[{'code':'XCD','name':'East Carribean dollar'}]]})
Solution 2:[2]
Here's how I would solve your problem, given that I understood it all correctly:
import pandas
df = pandas.DataFrame({
"x": ["A", "B", "C"],
"currencies": [[{"code": "x"}, {"code": "y"}], [{"code": "u"}], [{"code": "v"}]]
})
print(df)
> x currencies
0 A [{'code': 'x'}, {'code': 'y'}]
1 B [{'code': 'u'}]
2 C [{'code': 'v'}]
df["c_codes"] = df["currencies"].apply(lambda x: [i["code"] for i in x])
print(df)
> x currencies c_codes
0 A [{'code': 'x'}, {'code': 'y'}] [x, y]
1 B [{'code': 'u'}] [u]
2 C [{'code': 'v'}] [v]
What this apply method does is it iterates over the values in currencies and applies the lambda function to it: for each input (in this case a list of dicts) return a list of just the values of each code value. These results are then saved to the new column c_codes.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | richardec |
