'Python: Write to CSV with multiple header rows

Context

I am looking to export a dictionary which contains a list of dictionaries as the value of each key:value pair:

dict = {'key_1':
        [{'key_a': foo_1, 'key_b': bar_1}, 
         {'key_a': foo_2, 'key_b': bar_2}], 
        'key_2':
        [{'key_c': foo_1, 'key_d': bar_1}], 
         {'key_c': foo_2, 'key_d': bar_2}] 
        ...}

The desired output will be a .csv file which has the keys of the first dictionary (key_1, key_2, etc.), as the first header row, then the keys of the nested dictionaries (key_a, key_b, etc.), as a second header row which corresponds to its respective key.

An example of the desired output looks like this, where the list index column refers to data stored within the dictionary at each respective index within the list of dictionaries:

╔════════════╦═══════════════╤═══════════════╗
║            ║     key_1     │     key_2     ║
║ List Index ╠═══════╤═══════╪═══════╤═══════╣
║            ║ key_a │ key_b │ key_c │ key_d ║
╠════════════╬═══════╪═══════╪═══════╪═══════╣
║     0      ║ foo_1 │ bar_1 │ foo_1 │ bar_1 ║
╟────────────╫───────┼───────┼───────┼───────╢
║     1      ║ foo_2 │ bar_2 │ foo_2 │ bar_2 ║
╟────────────╫───────┼───────┼───────┼───────╢
║     2      ║ foo_3 │ bar_3 │ foo_3 │ bar_3 ║
╚════════════╩═══════╧═══════╧═══════╧═══════╝

Platform: Raspberry Pi 3b+, Python 3.6


Code

Currently, I am looking into different options for doing this, so do not have any coherent code which comes near working. However, in order of preference, here are a few options which I am considering:

  • Use pandas to form an array which mirrors the nature of the desired table. Then write this to CSV directly.

  • Write to CSV from the dictionary data-structure described above.

    import csv
    
    field_names = dict.keys()
    header2 = {'%s' %dict.keys() : dict[key_1][0].keys()}
    
    with open('project_data.csv', 'a') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=field_names)
        writer.writeheader()  # Write header row containing the top field names
        writer.writerow(header2)  #Write row containing the second field names
    

    As is evident, this code needs further development to make it work as desired.

  • Another method I have not considered?


The Question

What would be the best approach to write to a CSV in this format?



Solution 1:[1]

Here your "List Index" is a merged cell which you can't achieve in a CSV file because it doesn't contain formatting. You can either:

a) Write it to a xlsx file (XlsxWriter is a great library for this)

b) Keep it was a CSV but with an unmerged cell as anky_91 suggests

Solution 2:[2]

here's a solution for creating fieldnames for DictWriter() for a dictionary that contains dictionaries and also lists of dictionaries.

You need to walk the structure and generate the fieldnames along with a new dict that has those new names:

#!/usr/bin/env python3
import csv
import pprint as pp

myDict = {'key_1':
        [{'key_a': 'foo_1', 'key_b': 'bar_1'}, 
         {'key_a': 'foo_2', 'key_b': 'bar_2'}], 
        'key_2':
        [{'key_c': 'foo_1', 'key_d': 'bar_1'}, 
         {'key_c': 'foo_2', 'key_d': 'bar_2'}] }

def generateFieldnames(myDict):
    # create unique fieldnames from a dictionary containing dictionaries
    newDict={}
    fieldnames=[] # DictWriter will create a .csv with these header names
    
    for k,v in myDict.items():
        
        # is a dictionary?
        if (type(v) is dict):
            for kk,vv in v.items():
                print('k={0}, kk={1}, vv={2}'.format(k,kk,vv))
                name='{0}_{1}'.format(k,kk)
                fieldnames.append(name)
                newDict[name]=vv
                
        elif (type(v) is list):
            for item in range(len(v)):
                listItem=v.pop()
                if (type(listItem) is dict):
                    for kk,vv in listItem.items():
                        name='{0}_{1}'.format(k,kk)
                        fieldnames.append(name)
                        newDict[name]=vv
        
        else:
            print('k=[{0}] , v=[{1}]'.format(k,v))
            fieldnames.append(k)
            newDict[k]=v
    
    return fieldnames, newDict


# create fieldnames from the dictionary with lists and dictionaries
fieldnames, newDict=generateFieldnames(myDict)
pp.pprint(fieldnames)
print('\n')
pp.pprint(fieldnames)
print('\n\n')

# write a sample .csv with fieldnames as headers
fd = open('mytest.csv','a')
dw = csv.DictWriter( fd, fieldnames=fieldnames)

dw.writeheader() # write the header row

dw.writerow( newDict )
dw.writerow( newDict )
dw.writerow( newDict )

fd.close()

the result can be seen in the file mytest.csv:

key_1_key_a,key_1_key_b,key_1_key_a,key_1_key_b,key_2_key_c,key_2_key_d,key_2_key_c,key_2_key_d
foo_1,bar_1,foo_1,bar_1,foo_1,bar_1,foo_1,bar_1
foo_1,bar_1,foo_1,bar_1,foo_1,bar_1,foo_1,bar_1
foo_1,bar_1,foo_1,bar_1,foo_1,bar_1,foo_1,bar_1

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Phillip Watts
Solution 2