Flatten JSON array into long form

I am having trouble flattening a JSON array into the format I need. The JSON is fairly complex, with nested parts for various sections; a simplified version is shown below.

{
  "RESPONSE": {
    "@VersionID": "1.1",
    "@ResponseID": "A0001"
  },
  "SUMMARY": {
    "@PersonID": "Person01",
    "@_Name": "Attributes",
    "_DATA_SET": [
      {
        "@_Name": "Number of accounts",
        "@_Value": "27"
      },
      {
        "@_Name": "Average age of open accounts",
        "@_Value": "35"
      },
      {
        "@_Name": "Number of closed accounts",
        "@_Value": "4"
      }
    ]
  }
}

I have a dataset in which one column holds a JSON string like the one above in each row. For each row, I want to parse the SUMMARY section (specifically _DATA_SET) into long format so that I can eventually pivot each @_Name into its own column.

Current data:

row_id | json_example
1      | json_example1
2      | json_example2

Desired output:

row_id | Number of accounts | Average age of open accounts | Number of closed accounts
1      | 27                 | 35                           | 4
2      | 27                 | 35                           | 4
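
For reference, the long-form intermediate step described above can be sketched with the standard library alone. This is only an illustration under assumptions: the helper name summary_to_long is hypothetical, and the sample string reproduces just the SUMMARY part of the simplified JSON shown earlier.

```python
import json

def summary_to_long(row_id, raw_json):
    """Return one (row_id, name, value) tuple per entry in SUMMARY._DATA_SET.

    Hypothetical helper for illustration; missing sections yield an empty list.
    """
    doc = json.loads(raw_json)
    entries = doc.get("SUMMARY", {}).get("_DATA_SET", [])
    return [(row_id, entry["@_Name"], entry["@_Value"]) for entry in entries]

# Sample input: the SUMMARY section of the simplified JSON from the question
sample = json.dumps({
    "SUMMARY": {
        "@PersonID": "Person01",
        "@_Name": "Attributes",
        "_DATA_SET": [
            {"@_Name": "Number of accounts", "@_Value": "27"},
            {"@_Name": "Average age of open accounts", "@_Value": "35"},
            {"@_Name": "Number of closed accounts", "@_Value": "4"},
        ],
    }
})

# One tuple per attribute; this is the "long" shape that can later be pivoted
long_rows = summary_to_long(1, sample)
for row in long_rows:
    print(row)
```

Each tuple carries the row_id, so rows from different JSON documents can be concatenated before pivoting.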

I have tried the following code, which parses json_example into several columns, one of which is the summary column. However, I cannot figure out how to parse that column further into rows that I can then pivot into columns.

pd.json_normalize(df.json_example.apply(json.loads)) yields:

RESPONSE.@VersionID | RESPONSE.@ResponseID | SUMMARY.@PersonID | SUMMARY.@_Name | SUMMARY._DATA_SET
                                                                                  [{'@_Name': 'Number of accounts'...
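
One possible approach, offered as a sketch rather than a definitive answer, is to attach row_id to each parsed document and let pd.json_normalize expand SUMMARY._DATA_SET through its record_path and meta parameters, then pivot the result. The two-row DataFrame below is made-up sample data that reuses the simplified JSON from the question; the variable names records, long_df and wide_df are illustrative.

```python
import json
import pandas as pd

# Assumed sample data: two rows that both hold the simplified JSON from the question
sample = json.dumps({
    "RESPONSE": {"@VersionID": "1.1", "@ResponseID": "A0001"},
    "SUMMARY": {
        "@PersonID": "Person01",
        "@_Name": "Attributes",
        "_DATA_SET": [
            {"@_Name": "Number of accounts", "@_Value": "27"},
            {"@_Name": "Average age of open accounts", "@_Value": "35"},
            {"@_Name": "Number of closed accounts", "@_Value": "4"},
        ],
    },
})
df = pd.DataFrame({"row_id": [1, 2], "json_example": [sample, sample]})

# Parse each JSON string and add row_id to the parsed dict so it survives normalization
records = [
    {"row_id": rid, **json.loads(raw)}
    for rid, raw in zip(df["row_id"], df["json_example"])
]

# Long form: one row per _DATA_SET entry, with row_id carried along as metadata
long_df = pd.json_normalize(
    records,
    record_path=["SUMMARY", "_DATA_SET"],
    meta=["row_id"],
)

# Wide form: one column per @_Name, one row per row_id
wide_df = long_df.pivot(index="row_id", columns="@_Name", values="@_Value").reset_index()
wide_df.columns.name = None  # drop the leftover "@_Name" label on the column axis

print(long_df)
print(wide_df)
```

The values remain strings because that is how they appear in the JSON; pd.to_numeric can convert them if numbers are needed. Note that pivot raises an error if the same @_Name occurs twice for one row_id, in which case pivot_table with an explicit aggregation would be an alternative.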

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
