'Converting varying JSON objects to CSV-file
I am calling an API and it returns JSON string. I want to convert it to CSV-format so I can save it later to database. However, JSON objects keys cause problems because there is keys missing or keys are changing. I wrote this python script but because of keys I cannot get it to work:
import json
import csv
with open('custom.json') as json_file:
data = json.load(json_file)
custom_data = data['CustomJSON']
data_file = open('data_file.csv', 'w')
csv_writer = csv.writer(data_file)
count = 0
for i in custom_data:
if count == 0:
# Writing headers of CSV file
header = i.keys()
csv_writer.writerow(header)
count += 1
# Writing data of CSV file
csv_writer.writerow(i.values())
data_file.close()
How I can convert this type of JSON to CSV? Example JSON message:
{
"CustomJSON" : [
{
"id" : "1,
"name" : "Jack",
"surname" : "Bauer"
},
{
"id" : "2",
"name" : "John",
"surname" : "Smith"
"age" : "31",
"city" : "New York"
},
{
"id" : "3",
"name" : "Matt",
"surname" : "Secret"
"exception_1" : "Exception_1",
"exception_2" : "Exception_2"
"date" : "2022-02-08"
}
]
}
Should I try to loop all key-values first somehow and then later try to add data? Can anyone provide an example?
Solution 1:[1]
As you are reading a single JSON string, you will have everything in memory. So IMHO the simplest way is to first build the field names list, and then write everything to a csv file.
# compute the fieldnamelist
# this uses a dict because it is easy to update it while maintaining key order
keys = dict()
for d in data['CustomJSON']:
keys.update(d)
# write to the csv file
# this uses a DictWriter because the individual rows are already dicts
with open('data_file.csv', 'w', newline='') as data_file
csv_writer = csv.DictWriter(data_file, fieldnames = keys.keys())
_ = csv_writer.writeheader()
_ = csv_writer.writerows(data['CustomJSON'])
With your data it gives as expected:
id,name,surname,age,city,exception_1,exception_2,date
1,Jack,Bauer,,,,,
2,John,Smith,31,New York,,,
3,Matt,Secret,,,Exception_1,Exception_2,2022-02-08
Solution 2:[2]
I am a pandas fan(atic) so I'd do something like
import pandas as pd
# df is a pandas dataframe)
df = pd.read_json('http://data.com/foo')
df.to_csv('foo.csv')
Pandas has options for the CSV dialect, if need be. You should be able to do what you describe with those two function calls, though.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Serge Ballesta |
| Solution 2 | pletnes |
