'Pandas loop through Excel sheets and append to df
I am trying to loop through an Excel sheet and append the data from multiple sheets into a data frame.
So far I have:
master_df = pd.DataFrame()
for sheet in target_sheets:
df1 = file.parse(sheet, skiprows=4)
master_df.append(df1, ignore_index=True)
But then when I call master_df.head() it returns __
The data on these sheets is in the same format and relate to each other.
So I would like to join them like this:
Sheet 1 contains:
A1
B1
C1
Sheet 2 contains:
A2
B2
C2
Sheet 3:
A3
B3
C3
End result:
A1
B1
C1
A2
B2
C2
A3
B3
C3
Is my logic correct or how can I achieve this?
Solution 1:[1]
Simply use pd.concat():
pd.concat([pd.read_excel(file, sheet_name=sheet) for sheet in ['Sheet1','Sheet2','Sheet3']], axis=1)
For example, will yield:
A1 B1 C1 A2 B2 C2 A3 B3 C3
0 1 2 3 1 2 3 1 2 3
1 4 5 6 4 5 6 4 5 6
2 7 8 9 7 8 9 7 8 9
Solution 2:[2]
Below code will work even if you don't know the exact sheet_names in the excel file. You can try this:
import pandas as pd
xls = pd.ExcelFile('myexcel.xls')
out_df = pd.DataFrame()
for sheet in xls.sheet_names:
df = pd.read_excel('myexcel.xls', sheet_name=sheet)
out_df.append(df) ## This will append rows of one dataframe to another(just like your expected output)
print(out_df)
## out_df will have data from all the sheets
Let me know if this helps.
Solution 3:[3]
The output desired in the question is obtained by setting axis=0.
import pandas as pd
df2 = pd.concat([pd.read_excel(io="projects.xlsx", sheet_name=sheet) for sheet in ['JournalArticles','Proposals','Books']], axis=0)
df2
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | rahlf23 |
| Solution 2 | |
| Solution 3 | Blaine Mooers |
