How to update a df using a for loop and arrays in Python?
Suppose that I create the following df:
import pandas as pd
#column names
column_names = ["Time", "Currency", "Volatility expected", "Event", "Actual", "Forecast", "Previous"]
#create a dataframe including the column names
df = pd.DataFrame(columns=column_names)
Then, I create the following array that will have the cell values to add to my df:
rows = ["2:00", "GBP", "", "Construction Output (MoM) (Jan)", "1.1%", "0.5%", "2.0%",
"2:00", "GBP", "", "U.K. Construction Output (YoY) (Jan)", "9.9%", "9.2%", "7.4%"]
So, how can I use a for loop to update my df so it ends up like this:
|Time |Currency |Volatility expected |Event |Actual |Forecast |Previous |
------------------------------------------------------------------------------------------------------------------
|02:00 |GBP | |Construction Output (MoM) (Jan) |1.1% |0.5% |2.0% |
|04:00 |GBP | |U.K. Construction Output (YoY) (Jan)|9.9% |9.2% |7.4% |
I tried:
column_name_location = 0
for row in rows:
    df.at['0', df[column_name_location]] = row
    column_name_location += 1
print(df)
But got:
KeyError: 0
May I get some advice here?
Solution 1:[1]
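Assuming rows is actually a list of sub-lists, each sub-list being a row, you can create a pd.Series from each row, using the dataframe's column names as the Series's index, and then append them all. On pandas older than 2.0 this was done with df.append; DataFrame.append was removed in pandas 2.0, so pd.concat is used here instead:
df = pd.concat([df, pd.DataFrame([pd.Series(r, index=df.columns) for r in rows])], ignore_index=True)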
If rows is actually just a flat list, you'll need to convert it to a numpy array and reshape it into rows of 7 values (one per column) before doing the above:
import numpy as np
rows = np.array(rows).reshape(-1, 7).tolist()
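Putting the two steps together for the flat list from the question, a minimal sketch might look like the following. The names `nested` and `new_rows` are illustrative and not part of the original answer, and `len(column_names)` is assumed to be the row width instead of the hard-coded 7.

```python
import numpy as np
import pandas as pd

column_names = ["Time", "Currency", "Volatility expected", "Event",
                "Actual", "Forecast", "Previous"]
df = pd.DataFrame(columns=column_names)

# Flat list of 14 values, as in the question
rows = ["2:00", "GBP", "", "Construction Output (MoM) (Jan)", "1.1%", "0.5%", "2.0%",
        "2:00", "GBP", "", "U.K. Construction Output (YoY) (Jan)", "9.9%", "9.2%", "7.4%"]

# Reshape the flat list into sub-lists, one per row (name is illustrative)
nested = np.array(rows).reshape(-1, len(column_names)).tolist()

# Build a frame from the new rows and concatenate it onto the existing df
new_rows = pd.DataFrame(nested, columns=df.columns)
df = pd.concat([df, new_rows], ignore_index=True)
print(df)
```

Note that reshape raises a ValueError if the length of the flat list is not a multiple of the number of columns, which is a useful check that no cell is missing.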
Solution 2:[2]
It looks like you have created one list containing 14 items. You could instead make it a list of 2 items, where each item is a list of 7 values.
rows = [["2:00", "GBP", "", "Construction Output (MoM) (Jan)", "1.1%", "0.5%", "2.0%"],
["2:00", "GBP", "", "U.K. Construction Output (YoY) (Jan)", "9.9%", "9.2%", "7.4%"]]
With this, we can create a dataframe directly as shown below
df = pd.DataFrame(rows, columns=column_names)
print(df)
This outputs 2 rows:
   Time  Currency  Volatility expected                                 Event  Actual  Forecast  Previous
0  2:00       GBP                            Construction Output (MoM) (Jan)    1.1%      0.5%      2.0%
1  2:00       GBP                       U.K. Construction Output (YoY) (Jan)    9.9%      9.2%      7.4%
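Since the question asks specifically for a for loop, here is one possible sketch that keeps the original flat list and fills the df one row at a time. It assumes the flat list always holds complete rows; `n_cols` and `start` are illustrative names. The original attempt failed because `df[column_name_location]` looks up a column named 0, which does not exist, hence the KeyError.

```python
import pandas as pd

column_names = ["Time", "Currency", "Volatility expected", "Event",
                "Actual", "Forecast", "Previous"]
df = pd.DataFrame(columns=column_names)

# Flat list of 14 values, as in the question
rows = ["2:00", "GBP", "", "Construction Output (MoM) (Jan)", "1.1%", "0.5%", "2.0%",
        "2:00", "GBP", "", "U.K. Construction Output (YoY) (Jan)", "9.9%", "9.2%", "7.4%"]

n_cols = len(column_names)

# Walk the flat list in steps of n_cols; each slice is one row of the df
for start in range(0, len(rows), n_cols):
    # df.loc with a new label appends a row (setting with enlargement)
    df.loc[len(df)] = rows[start:start + n_cols]

print(df)
```

Appending row by row is slower than building the dataframe in one call, so for large inputs the direct pd.DataFrame(rows, columns=column_names) construction above is likely the better choice.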
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | richardec |
| Solution 2 | Manjunath K Mayya |
