Can you perform an identical operation on each Pandas dataframe in a list?

I have a list of dataframes corresponding to different countries, each formatted the same way.

Here's AUH for example, the dataframe for Austria-Hungary:

     stateabb  ccode  year  milex  milper  irst   pec     tpop    upop      cinc  version  eu_gp_cinc  eu_gp_irst   eu_cinc
144       AUH    300  1871   9326     260   425  6874  36140.0  1232.0  0.042516     2021    0.677020        9828  0.062799
145       AUH    300  1872   9130     264   460  6987  36450.0  1265.0  0.046151     2021    0.639875       10790  0.072125
146       AUH    300  1873   9762     292   534  7596  36322.0  1299.0  0.047668     2021    0.637254       10997  0.074802
147       AUH    300  1874   9818     285   494  7263  36222.0  1333.0  0.047428     2021    0.639522       10081  0.074162
148       AUH    300  1875   9723     303   463  7090  36436.0  1369.0  0.046975     2021    0.646770       10603  0.072631

My dataframes are in a list, like:

great_powers = ["AUH", "GMY", "TUR"]

Where GMY is Germany, TUR is Turkey, etc.

If I wanted to do a column operation, like squaring irst and dividing by tpop for every dataset, would that be possible?

Initially I would try:

for dataset in great_powers:
    dataset["IRST Square Divided By Pop"] = dataset["irst"]*dataset["irst"]/dataset["tpop"]

But previous attempts (where I was resetting the index for each one, not doing the column multiplication) would simply define a new dataframe each time.

Thank you.



Solution 1:[1]

Yes, it's possible. I don't understand why your previous attempt didn't work; it was probably a typo.

This creates a new column in each dataframe:

import pandas as pd

# Create some dataframes from dictionaries
d = {'col1': [1, 2], 'col2': [3, 4], 'col3': [5, 6], 'col4': [7, 8]}
e = {'col1': [1, 2], 'col2': [3, 4], 'col3': [5, 6], 'col4': [7, 8]}
f = {'col1': [1, 2], 'col2': [3, 4], 'col3': [5, 6], 'col4': [7, 8]}

dfd = pd.DataFrame(data=d)
dfe = pd.DataFrame(data=e)
dff = pd.DataFrame(data=f)

# Create a list of dataframes
dfs = [dfd, dfe, dff]

# Iterate over the dataframes, adding a new column to each one in place
for df in dfs:
    df['new'] = df['col1']*df['col2']/df['col3']
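
One possible explanation for the earlier index-resetting attempt, assuming it looked something like `dataset = dataset.reset_index()` inside the loop (the question does not show that code): `reset_index` returns a new dataframe by default, and assigning it to the loop variable only rebinds that name, so the objects in the list stay as they were. Column assignment, by contrast, modifies the existing object. A minimal sketch with made-up stand-in frames and a hypothetical column name:

```python
import pandas as pd

# Hypothetical stand-in frames; the non-default index mimics the question's data
frames = [
    pd.DataFrame({"irst": [425, 460], "tpop": [36140.0, 36450.0]}, index=[144, 145]),
    pd.DataFrame({"irst": [100, 200], "tpop": [1000.0, 2000.0]}, index=[7, 8]),
]

# Rebinding the loop variable leaves the dataframes in the list untouched
for df in frames:
    df = df.reset_index(drop=True)
index_after_rebinding = list(frames[0].index)  # the original index is still there

# Building a new list from the returned dataframes keeps the result
frames = [df.reset_index(drop=True) for df in frames]

# Column assignment modifies each dataframe in place, so a plain loop is enough
for df in frames:
    df["irst_sq_per_pop"] = df["irst"] ** 2 / df["tpop"]
```

Passing `inplace=True` to `reset_index` inside the loop would be another way to keep the change without rebuilding the list.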

Note: Make sure the items in your list are the dataframe objects themselves, not strings. In your question the names are inside quotes (["AUH", "GMY", "TUR"]), which creates a list of strings; it should be [AUH, GMY, TUR].
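
If you do want to refer to the dataframes by country code, one option is a dictionary that maps each code to its dataframe. This is only a sketch: the tiny frames, the name `frames_by_code`, and the new column name are placeholders rather than anything from the question's real data.

```python
import pandas as pd

# Hypothetical miniature frames standing in for the real country data
AUH = pd.DataFrame({"irst": [425, 460], "tpop": [36140.0, 36450.0]})
GMY = pd.DataFrame({"irst": [100, 200], "tpop": [1000.0, 2000.0]})

# Map each country code (a string) to its dataframe object
frames_by_code = {"AUH": AUH, "GMY": GMY}

# The list of strings is now usable, because each string is looked up in the dict
great_powers = ["AUH", "GMY"]
for code in great_powers:
    df = frames_by_code[code]
    df["irst_sq_per_pop"] = df["irst"] * df["irst"] / df["tpop"]
```

Because the dictionary holds references to the same objects, `AUH` and `GMY` themselves gain the new column.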

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 baileythegreen