'Normalizing all levels of nested JSON

I am trying to normalize this nested JSON file but the 'rewardsReceiptItemList' will not flatten no matter what I do. I tried using json.normalize and record_path= argument.

Here is an example of the JSON:

{"result":[{"_id": {"$oid": "5ff1e1eb0a720f0523000575"}, "bonusPointsEarned": 500, "bonusPointsEarnedReason": "Receipt number 2 completed, bonus point schedule DEFAULT (5cefdcacf3693e0b50e83a36)", "createDate": {"$date": 1609687531000}, "dateScanned": {"$date": 1609687531000}, "finishedDate": {"$date": 1609687531000}, "modifyDate": {"$date": 1609687536000}, "pointsAwardedDate": {"$date": 1609687531000}, "pointsEarned": "500.0", "purchaseDate": {"$date": 1609632000000}, "purchasedItemCount": 5, "rewardsReceiptItemList": [{"barcode": "4011", "description": "ITEM NOT FOUND", "finalPrice": "26.00", "itemPrice": "26.00", "needsFetchReview": false, "partnerItemId": "1", "preventTargetGapPoints": true, "quantityPurchased": 5, "userFlaggedBarcode": "4011", "userFlaggedNewItem": true, "userFlaggedPrice": "26.00", "userFlaggedQuantity": 5}], "rewardsReceiptStatus": "FINISHED", "totalSpent": "26.00", "userId": "5ff1e1eacfcf6c399c274ae6"}, {"_id": {"$oid": "5ff1e1bb0a720f052300056b"}, "bonusPointsEarned": 150, "bonusPointsEarnedReason": "Receipt number 5 completed, bonus point schedule DEFAULT (5cefdcacf3693e0b50e83a36)", "createDate": {"$date": 1609687483000}, "dateScanned": {"$date": 1609687483000}, "finishedDate": {"$date": 1609687483000}, "modifyDate": {"$date": 1609687488000}, "pointsAwardedDate": {"$date": 1609687483000}, "pointsEarned": "150.0", "purchaseDate": {"$date": 1609601083000}, "purchasedItemCount": 2, "rewardsReceiptItemList": [{"barcode": "4011", "description": "ITEM NOT FOUND", "finalPrice": "1", "itemPrice": "1", "partnerItemId": "1", "quantityPurchased": 1}, {"barcode": "028400642255", "description": "DORITOS TORTILLA CHIP SPICY SWEET CHILI REDUCED FAT BAG 1 OZ", "finalPrice": "10.00", "itemPrice": "10.00", "needsFetchReview": true, "needsFetchReviewReason": "USER_FLAGGED", "partnerItemId": "2", "pointsNotAwardedReason": "Action not allowed for user and CPG", "pointsPayerId": "5332f5fbe4b03c9a25efd0ba", "preventTargetGapPoints": true, "quantityPurchased": 1, "rewardsGroup": "DORITOS SPICY SWEET CHILI SINGLE SERVE", "rewardsProductPartnerId": "5332f5fbe4b03c9a25efd0ba", "userFlaggedBarcode": "028400642255", "userFlaggedDescription": "DORITOS TORTILLA CHIP SPICY SWEET CHILI REDUCED FAT BAG 1 OZ", "userFlaggedNewItem": true, "userFlaggedPrice": "10.00", "userFlaggedQuantity": 1}], "rewardsReceiptStatus": "FINISHED", "totalSpent": "11.00", "userId": "5ff1e194b6a9d73a3a9f1052"},{"_id": {"$oid": "5ff475820a7214ada10005cf"}, "createDate": {"$date": 1609856386000}, "dateScanned": {"$date": 1609856386000}, "modifyDate": {"$date": 1609856386000}, "rewardsReceiptStatus": "SUBMITTED", "userId": "5a43c08fe4b014fd6b6a0612"}]}

After applying json normalize, it looks like this (one column that is still nested):

There is one column that is still nested

I tried using record path, but it shows a KeyError:

enter image description here



Solution 1:[1]

import json
import pandas as pd

file = """<your_json_string>"""

data = json.loads(file)

# Remove the records that don't have the required Key.
data['result'] = [x for x in data['result'] if x.get('rewardsReceiptItemList')]

df = pd.json_normalize(data, ['result', 'rewardsReceiptItemList'])
print(df)

Output:

        barcode                                        description finalPrice itemPrice needsFetchReview partnerItemId  ... needsFetchReviewReason               pointsNotAwardedReason             pointsPayerId                            rewardsGroup   rewardsProductPartnerId                             userFlaggedDescription
0          4011                                     ITEM NOT FOUND      26.00     26.00            False             1  ...                    NaN                                  NaN                       NaN                                     NaN                       NaN                                                NaN
1          4011                                     ITEM NOT FOUND          1         1              NaN             1  ...                    NaN                                  NaN                       NaN                                     NaN                       NaN                                                NaN
2  028400642255  DORITOS TORTILLA CHIP SPICY SWEET CHILI REDUCE...      10.00     10.00             True             2  ...           USER_FLAGGED  Action not allowed for user and CPG  5332f5fbe4b03c9a25efd0ba  DORITOS SPICY SWEET CHILI SINGLE SERVE  5332f5fbe4b03c9a25efd0ba  DORITOS TORTILLA CHIP SPICY SWEET CHILI REDUCE...

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1