'How to Convert CSV to JSON using Python where some JSON keys have multiple values

I am trying to convert csv to json but couldn't get the expected results

Pyton Code:

import csv
import json
  
def csv_to_json(csvFile, jsonFile):    
    jsondict = {}  
    with open(csvFile, encoding='utf_8_sig') as csvfile:  
        csv_data = csv.DictReader(csvfile)
        jsondict["data"]=[]
        for rows in csv_data:  
            print(rows)
            jsondict["data"].append(rows)
  
    with open(jsonFile, 'w') as jsonfile:  
        jsonfile.write(json.dumps(jsondict, indent = 4))
csvfile = 'sample.csv'  
jsonfile = 'sampleoutput.json'  
  
csv_to_json(csvfile, jsonfile)

Sample CSV Data:

Keys Value
100 value100-1
200 value200-1
200 value200-2
300 value300-1
300 value300-2
400 value400-1
400 value400-2
400 value400-3

Expected Result:

    [
      {
        "Keys": "100",
        "Value": "value100-1"
        },
        {
        "Keys": "200",
        "Value": ["value200-1", "value200-2"]
      },
      {
        "Keys": "300",
        "Value": ["value300-1", "value300-2"]
        },
        {
        "Keys": "400",
        "Value": ["value400-1", "value400-2", "value400-3"]
       }
    ]

Current Result:

{
    "data": [{
            "Keys": "100",
            "Value": "value100-1"
        },
        {
            "Keys": "200",
            "Value": "value200-1"
        },
        {
            "Keys": "200",
            "Value": "value200-2"
        },
        {
            "Keys": "300",
            "Value": "value300-1"
        },
        {
            "Keys": "300",
            "Value": "value300-2"
        },
        {
            "Keys": "400",
            "Value": "value400-1"
        },
        {
            "Keys": "400",
            "Value": "value400-2"
        },
        {
            "Keys": "400",
            "Value": "value400-3"
        }
    ]
}

TIA for you honest and immediate response :)



Solution 1:[1]

The above expected result require grouping of data inside csv file Use code below instead

import json
import io
import pandas as pd

raw_string = open("sample.csv", "r")
raw_string = ''.join([i for i in raw_string])  
raw_data = [line.split(',') for line in raw_string.split()]

df = pd.read_csv(io.StringIO(raw_string))

for item in list(df.groupby(by=[i for i in df.columns if not i.endswith("Value")])):
    df_temp = item[1]
    # messy line that matches columns to their values. The list(set(x) makes it so values are unique but also json serializable
    a = {k : (list(set(v)) if len(set(v)) > 1 else list(set(v))[0]) for k, v in df_temp.to_dict("list").items()}
    print(json.dumps(a, indent=4))

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 deceze