Error while loading stack of dictionaries to DataFrame in for loop
I am reading Excel files from a folder, filtering out some of them, and looping through the rest to get the data. Each time I read a file I get a dictionary, and when I try to convert that dictionary to a DataFrame I get an error.
How can I convert these dictionaries to a DataFrame? My code is below.
from pathlib import Path
import pandas as pd

p = Path("path to excel files")  # placeholder for the folder path
# skip files whose name contains "AC0" and hidden files
filtered_files = [x for x in p.glob("**/*.xlsx")
                  if "AC0" not in x.name and not x.name.startswith(".")]
for i, file in enumerate(filtered_files):
    # read file into a DataFrame
    df_xl = pd.read_excel(file, sheet_name=[0], header=0)
    df = pd.concat({k: pd.Series(v) for k, v in df_xl.items()}).reset_index()
Output of df_xl (printed once per file):
{0: A ... Unnamed: 4
0 A01 ... NaN
1 A02 ... NaN
2 A03 ... NaN
3 A04 ... NaN
4 A04a ... NaN
.. ... ... ...
92 NaN ... NaN
93 NaN ... NaN
94 NaN ... NaN
95 NaN ... NaN
96 NaN ... NaN
[97 rows x 5 columns]}
{0: A ... Unnamed: 4
0 A01 ... NaN
1 A02 ... NaN
2 A03 ... NaN
3 A04 ... NaN
4 A04a ... NaN
.. ... ... ...
92 NaN ... NaN
93 NaN ... NaN
94 NaN ... NaN
95 NaN ... NaN
96 NaN ... NaN
[97 rows x 5 columns]}
Error:
The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Solution 1:[1]
It is hard to say for sure without seeing your Excel file(s), but it looks like df_xl is a dict with integer keys and DataFrames as values. Can you at least add the output of type(df_xl) to your question?
Assuming this is the case, it is still not clear from your output whether the DataFrames are all the same size, but if they are, this should work:
df = pd.concat(df_xl.values(), axis=1).reset_index()
or
df = pd.concat(df_xl.values(), ignore_index=True, axis=0).reset_index()
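To show the row-wise variant across several files, here is a minimal sketch. It assumes each file yields a dict of the form {0: DataFrame}, which is what pd.read_excel returns when sheet_name is a list. The helper fake_read_excel and its sample data are hypothetical stand-ins so the snippet runs without any Excel files. The frames are passed to pd.concat directly instead of being wrapped in pd.Series.

```python
import pandas as pd


def fake_read_excel(rows):
    """Hypothetical stand-in for pd.read_excel(file, sheet_name=[0]).

    Returns a dict keyed by sheet index whose value is a DataFrame.
    """
    return {0: pd.DataFrame({"A": rows, "B": range(len(rows))})}


# One dict per "file" (sample data, not taken from the question).
per_file = [
    fake_read_excel(["A01", "A02"]),
    fake_read_excel(["A03", "A04", "A04a"]),
]

frames = []
for df_xl in per_file:
    # Stack the DataFrames held in the dict row-wise.
    frames.append(pd.concat(list(df_xl.values()), ignore_index=True))

# Combine the per-file frames into a single DataFrame.
combined = pd.concat(frames, ignore_index=True)
print(combined.shape)
```

If only the first sheet is needed, passing sheet_name=0 (an integer rather than a list) makes pd.read_excel return a DataFrame directly, so no dict handling is required.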
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | piterbarg |
