How to concat dataframes iteratively from different folders using Python
So I'm trying to concat (one after the other) dataframes from several different folders and write them to a single CSV file, keeping just the first header and dropping the following ones. What I want to do is make the code run through every folder in a specific directory, pick up the right CSV file from each, and append the data to a final CSV file. Here's what I've tried so far:
for runs in range(len(runlist)):
    if fnmatch.fnmatch(runlist[runs], '*month*'):
        data_run = glob.glob(runlist[runs] + '/processing/*day*.csv')  ## setting the directory
        for file in data_run:
            final_df = pd.concat([pd.read_csv(file, header=None) for file in data_run],
                                 ignore_index=True, axis=0)
            final_df.to_csv('date_all.csv', index=False)

date_all = pd.read_csv("date_all.csv")
date_all_col = date_all["7"]
date_all_col = pd.to_numeric(date_all_col, errors='coerce')
date_all_col.plot.hist()
The results I'm getting are like this:
0 WA()
1 23.12
2 12.15
3 13.52
Name: 7, dtype: object
0 WA()
1 35.18
2 14.85
3 26.16
Name: 7, dtype: object
0 WA()
1 62.12
2 45.52
3 18.22
And so on. When I plot a histogram, I get this: https://i.stack.imgur.com/lZpFa.png
It's just not right: I want the WA() values from all the files to be plotted together in a single histogram.
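A likely cause of the stray `WA()` rows: `header=None` tells pandas to treat every row, including each file's header line, as data, which turns the whole column into strings (`dtype: object`). A minimal sketch of the difference, using two in-memory CSVs in place of the per-day files:

```python
import io
import pandas as pd

# Two small in-memory CSVs standing in for the per-day files;
# each starts with the same 'WA()' header row.
csv_a = "WA()\n23.12\n12.15\n13.52\n"
csv_b = "WA()\n35.18\n14.85\n26.16\n"

# Reading WITHOUT header=None lets pandas consume the header row,
# so 'WA()' never appears as a data value and the column stays numeric.
frames = [pd.read_csv(io.StringIO(text)) for text in (csv_a, csv_b)]
combined = pd.concat(frames, ignore_index=True)

print(combined["WA()"].dtype)  # float64, not object
print(len(combined))           # 6 data rows, no header rows mixed in
```

With `header=None`, the same concat would give 8 rows of strings, two of them the literal text `WA()`.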
Solution 1:[1]
for runs in range(len(runlist)):
    if fnmatch.fnmatch(runlist[runs], '*month*'):
        data_run = glob.glob(runlist[runs] + '/processing/*day*.csv')  ## setting the directory
        final_df = pd.DataFrame()
        for file in data_run:
            added_df = pd.read_csv(file)['WA()']
            added_df = pd.to_numeric(added_df, errors='coerce')
            added_df.plot.hist()
            final_df = pd.concat([final_df, added_df], axis=1)
        final_df.to_csv('date_all.csv', index=False)
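Note that the loop above draws one histogram per file. If the goal is a single histogram over all folders combined, stacking every file's `WA()` column into one long Series first may be closer to what the question asks. A hedged sketch, assuming each CSV has a `WA()` column (the helper name is made up for illustration):

```python
import pandas as pd

def combined_wa_series(paths):
    """Stack the 'WA()' column from every CSV in `paths` into one Series,
    coercing non-numeric leftovers (e.g. stray header rows) to NaN and
    dropping them."""
    cols = [pd.to_numeric(pd.read_csv(p)['WA()'], errors='coerce') for p in paths]
    return pd.concat(cols, ignore_index=True).dropna()

# Usage (paths would come from the runlist/glob loop above):
# combined_wa_series(glob.glob(run + '/processing/*day*.csv')).plot.hist()
```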
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | BeRT2me |
