Combine DataFrames resulting from a for loop

I need a little help appending the data that is generated by the for loop below. Currently, I'm writing it to a DataFrame in the line "df = pd.DataFrame(li_row, columns=col_names)".

But when I have multiple files whose names start with PAJ, I need the resulting DataFrames to be appended into one DataFrame.

Also, the code below is bits and pieces we gathered and amended to suit our needs. Please excuse me if you feel it's a mess. :)

import xmlschema
import os
import xml.etree.ElementTree as ET
import pandas as pd


dirpath = "C:\\Users\\xxxxx\\PycharmProjects\\pythonProject\\xmls"
filenames = os.listdir(dirpath)
# print(filenames)

for eachfile in filenames:
    fname = eachfile[0:3]
    print(dirpath+'\\'+eachfile)
    if fname == 'PAJ':
        xmlschema.validate(dirpath+'\\'+eachfile, 'PAJ.xsd')
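        # parse using the full path; the bare file name only resolves if the
        # script happens to run from inside dirpath
        tree = ET.parse(os.path.join(dirpath, eachfile))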
        root = tree.getroot()
        # Get AlertID from header
        cols = {}
        for header in root.findall(".//header/alertId"):
            cols[header.tag] = header.text
        # print(cols)


        # get detailhr to be used for column header names
        col_names = []
        for DtHeader in root.findall(".//detailHdr/c"):
            col_names.append(DtHeader.text)
        # print(col_names)

        # Get row and c
        li_row = []
        size = 0
        for Data in root.findall(".//report/data"):

            for child in Data:
                # print(child.tag,child.text,len(Data))
                li_row.append([])
                for grandchild in child:
                    # print(grandchild.tag, grandchild.text,len(child))
                    li_row[size].append(grandchild.text)

                size += 1

        # print(li_row)

        # create a dataframe with the col_names and row with c and alertid added at the end
        df = pd.DataFrame(li_row, columns=col_names)
        df['alertId'] = cols['alertId']
        print(df)
    elif fname == 'PIE':
        fileContent = ''
        with open(dirpath + '\\' + eachfile) as filehandle:
            fileContent = filehandle.read()
        modFileContent = fileContent.replace("UTF-16", "UTF-8")
        xmlschema.validate(modFileContent, 'PIE.xsd')

python pandas xml

Solution 1:^[1]

If I were to change your current solution as little as possible, I would create a list, paj_data_frames, append each file's DataFrame to it inside the loop, and concatenate them once the loop is done. See the pd.concat documentation: https://pandas.pydata.org/docs/user_guide/merging.html

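paj_data_frames = []  # one DataFrame per PAJ file
for eachfile in filenames:
    # ... (rest of the loop body unchanged)
    if fname == 'PAJ':
        df = pd.DataFrame(li_row, columns=col_names)
        df['alertId'] = cols['alertId']
        paj_data_frames.append(df)  # collect instead of overwriting df
    # ... (rest of the loop body unchanged)
# concatenate once, after the loop has finished
final_df = pd.concat(paj_data_frames)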
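
To try the pattern without the XML files, here is a minimal self-contained sketch. The helper name combine_frames and the two small sample frames are made up for illustration; they only stand in for the per-file DataFrames built in the loop. It also passes ignore_index=True, which is optional and simply renumbers the rows of the combined frame from 0, and it guards against an empty list because pd.concat raises a ValueError when given nothing to concatenate.

```python
import pandas as pd


def combine_frames(frames):
    """Concatenate a list of DataFrames into one, renumbering the index.

    Returns an empty DataFrame if the list is empty, since pd.concat
    raises ValueError when there is nothing to concatenate.
    """
    if not frames:
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)


# Hypothetical stand-ins for the DataFrames built from two PAJ files.
df_a = pd.DataFrame([["1", "x"], ["2", "y"]], columns=["id", "value"])
df_a["alertId"] = "A100"
df_b = pd.DataFrame([["3", "z"]], columns=["id", "value"])
df_b["alertId"] = "B200"

final_df = combine_frames([df_a, df_b])
print(final_df)
```

Building the list first and calling pd.concat once is generally cheaper than growing a DataFrame inside the loop, because each concatenation copies the data.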

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	tomvonheill
