Python: how to write to a CSV file with a dynamic table

I have a big CSV file (1 GB) that I need to read, process, and write to a new CSV file in this format:

TimeStamp, header1, header2, header3, ....
0:00:53:922:131, data1, , , ....
0:00:54:922:131, , data2, , ....
0:00:55:922:131, , , data3, ....

The input CSV file has the format below (some don't have a header):

Timestamp, info1, info2, info3, ......
0:00:53:922:131, info_bit1,info_bit2 ,info_bit3 , ....
0:00:54:922:131, info_bit1,info_bit2 ,info_bit3 , ....
0:00:55:922:131, info_bit1,info_bit2 ,info_bit3 , ....

The headers and their data are retrieved as a dictionary while processing each line read from the CSV file.

Here is a snippet of my code:

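import time

import pandas as pd

# chunks, test and outputfile come from the rest of the script (not shown)
for chunk in chunks:
    for col in chunk.values:  # each `col` is one row of the chunk
        ts = col[0]
        msgID = int(str(col[5]), 16)  # message ID is stored as a hex string
        dlc = col[6]
        eol = 8+dlc
        concatData = ""

        for data in col[8:eol]:
            data = str(data).zfill(2)
            concatData += data
        hexData = bytes.fromhex(concatData)
        decoded = test.decode(msgID, hexData)  # decode once, reuse below
        dataMap = {ts: decoded}
        # appends a one-row frame, so a header line is written on every call
        pd.DataFrame.from_dict(dataMap, orient='index').to_csv(outputfile, mode='a')
        print('{} {}'.format(ts, decoded))

    time.sleep(0.01)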

test.decode returns {'key_name1': value,'key_name2': value,'key_name3': value}, and the size differs between calls: some results have 4 keys, some have 8, etc. The same key will also appear again at a different timestamp. I attach the timestamp to this dictionary to create a new dictionary.

The output I get is:

,key1_name1,key1_name2,key1_name3,,,,,,,,,,,,,,,
0:00:53:915:439,value1,value2,value3,,,,,,,,,,,,,,,
,key2_name1,key2_name2,key2_name3,key2_name4,,,,,,,,,,,,,,,
0:00:53:915:613,value1,value2,value3,value4,,,,,,,,,,,,,,,,

And below is what I want in the output file:

Timestamp, key1_name1, key1_name2,key2_name1,key2_name2,key2_name3,key2_name4,,,,,,
0:00:53:915:439,key1_value1,key1_value2,key1_value3,,,,,,
0:00:53:915:613,,,,,,key2_value1,key2_value2,key2_value3,key2_value4,,,,,,,,,,,,
0:00:54:915:613,key1_value4,key1_value5,key1_value6,,,,,,,,
0:00:55:915:613,,,,,key2_value5,key2_value6,key2_value7,key2_value8,key2_value9,,,,,

There will be hundreds of key names, which I would like to keep only in the header, while the values appear at different timestamps.

How can I append a header to the existing columns and have the data start from a certain row/column?



Solution 1:[1]

I think you can do this the conventional way, with open():

my_data = []  # list of dict
with open('output.csv', 'a') as f:
    for d in my_data:
        # one comma-separated line of values per dict
        f.write(",".join(str(i) for i in d.values()) + "\n")
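Note that the loop above writes only the values, so the columns will not line up when the dicts have different keys. One possible way to get a single header with blanks for the missing keys is the standard library's csv.DictWriter with two passes over the data: first collect every key, then write the rows. The following is only a sketch under some assumptions: write_wide_csv, make_records and sample_records are hypothetical names, the decoded dicts never contain a key called "Timestamp", and the input can be read twice (for a 1 GB file, that means re-reading and re-decoding it instead of holding everything in memory).

```python
import csv
import io


def write_wide_csv(make_records, out_file, ts_field="Timestamp"):
    """Write (timestamp, dict) records as one wide CSV with a single header.

    make_records is a zero-argument callable (hypothetical) that returns a
    fresh iterator of (timestamp, decoded_dict) pairs each time it is called.
    """
    # Pass 1: collect every key in first-seen order (dicts keep insertion order).
    fieldnames = {ts_field: None}
    for _, decoded in make_records():
        fieldnames.update(dict.fromkeys(decoded))
    header = list(fieldnames)

    # Pass 2: write the header once, then one row per record. Keys that a
    # record does not have are filled with restval, i.e. an empty cell.
    writer = csv.DictWriter(out_file, fieldnames=header, restval="")
    writer.writeheader()
    for ts, decoded in make_records():
        writer.writerow({ts_field: ts, **decoded})
    return header


def sample_records():
    # Stand-in for the real "read chunk, decode row" loop (made-up values).
    yield "0:00:53:915:439", {"key1_name1": 1, "key1_name2": 2}
    yield "0:00:53:915:613", {"key2_name1": 3}
    yield "0:00:54:915:613", {"key1_name1": 4, "key1_name2": 5}


buf = io.StringIO()
header = write_wide_csv(sample_records, buf)
print(buf.getvalue())
```

For a real file, pass a handle opened with open('output.csv', 'w', newline='') instead of the StringIO buffer. If decoding twice is too slow, an alternative would be to write the rows to a temporary file during the first pass and only prepend the header at the end.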

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 danangjoyoo