'How to Convert CSV to JSON using Python where some JSON keys have multiple values
I am trying to convert csv to json but couldn't get the expected results
Pyton Code:
import csv
import json
def csv_to_json(csvFile, jsonFile):
jsondict = {}
with open(csvFile, encoding='utf_8_sig') as csvfile:
csv_data = csv.DictReader(csvfile)
jsondict["data"]=[]
for rows in csv_data:
print(rows)
jsondict["data"].append(rows)
with open(jsonFile, 'w') as jsonfile:
jsonfile.write(json.dumps(jsondict, indent = 4))
csvfile = 'sample.csv'
jsonfile = 'sampleoutput.json'
csv_to_json(csvfile, jsonfile)
Sample CSV Data:
| Keys | Value |
|---|---|
| 100 | value100-1 |
| 200 | value200-1 |
| 200 | value200-2 |
| 300 | value300-1 |
| 300 | value300-2 |
| 400 | value400-1 |
| 400 | value400-2 |
| 400 | value400-3 |
Expected Result:
[
{
"Keys": "100",
"Value": "value100-1"
},
{
"Keys": "200",
"Value": ["value200-1", "value200-2"]
},
{
"Keys": "300",
"Value": ["value300-1", "value300-2"]
},
{
"Keys": "400",
"Value": ["value400-1", "value400-2", "value400-3"]
}
]
Current Result:
{
"data": [{
"Keys": "100",
"Value": "value100-1"
},
{
"Keys": "200",
"Value": "value200-1"
},
{
"Keys": "200",
"Value": "value200-2"
},
{
"Keys": "300",
"Value": "value300-1"
},
{
"Keys": "300",
"Value": "value300-2"
},
{
"Keys": "400",
"Value": "value400-1"
},
{
"Keys": "400",
"Value": "value400-2"
},
{
"Keys": "400",
"Value": "value400-3"
}
]
}
TIA for you honest and immediate response :)
Solution 1:[1]
The above expected result require grouping of data inside csv file Use code below instead
import json
import io
import pandas as pd
raw_string = open("sample.csv", "r")
raw_string = ''.join([i for i in raw_string])
raw_data = [line.split(',') for line in raw_string.split()]
df = pd.read_csv(io.StringIO(raw_string))
for item in list(df.groupby(by=[i for i in df.columns if not i.endswith("Value")])):
df_temp = item[1]
# messy line that matches columns to their values. The list(set(x) makes it so values are unique but also json serializable
a = {k : (list(set(v)) if len(set(v)) > 1 else list(set(v))[0]) for k, v in df_temp.to_dict("list").items()}
print(json.dumps(a, indent=4))
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | deceze |
