'Normalizing all levels of nested JSON
I am trying to normalize this nested JSON file but the 'rewardsReceiptItemList' will not flatten no matter what I do. I tried using json.normalize and record_path= argument.
Here is an example of the JSON:
{"result":[{"_id": {"$oid": "5ff1e1eb0a720f0523000575"}, "bonusPointsEarned": 500, "bonusPointsEarnedReason": "Receipt number 2 completed, bonus point schedule DEFAULT (5cefdcacf3693e0b50e83a36)", "createDate": {"$date": 1609687531000}, "dateScanned": {"$date": 1609687531000}, "finishedDate": {"$date": 1609687531000}, "modifyDate": {"$date": 1609687536000}, "pointsAwardedDate": {"$date": 1609687531000}, "pointsEarned": "500.0", "purchaseDate": {"$date": 1609632000000}, "purchasedItemCount": 5, "rewardsReceiptItemList": [{"barcode": "4011", "description": "ITEM NOT FOUND", "finalPrice": "26.00", "itemPrice": "26.00", "needsFetchReview": false, "partnerItemId": "1", "preventTargetGapPoints": true, "quantityPurchased": 5, "userFlaggedBarcode": "4011", "userFlaggedNewItem": true, "userFlaggedPrice": "26.00", "userFlaggedQuantity": 5}], "rewardsReceiptStatus": "FINISHED", "totalSpent": "26.00", "userId": "5ff1e1eacfcf6c399c274ae6"}, {"_id": {"$oid": "5ff1e1bb0a720f052300056b"}, "bonusPointsEarned": 150, "bonusPointsEarnedReason": "Receipt number 5 completed, bonus point schedule DEFAULT (5cefdcacf3693e0b50e83a36)", "createDate": {"$date": 1609687483000}, "dateScanned": {"$date": 1609687483000}, "finishedDate": {"$date": 1609687483000}, "modifyDate": {"$date": 1609687488000}, "pointsAwardedDate": {"$date": 1609687483000}, "pointsEarned": "150.0", "purchaseDate": {"$date": 1609601083000}, "purchasedItemCount": 2, "rewardsReceiptItemList": [{"barcode": "4011", "description": "ITEM NOT FOUND", "finalPrice": "1", "itemPrice": "1", "partnerItemId": "1", "quantityPurchased": 1}, {"barcode": "028400642255", "description": "DORITOS TORTILLA CHIP SPICY SWEET CHILI REDUCED FAT BAG 1 OZ", "finalPrice": "10.00", "itemPrice": "10.00", "needsFetchReview": true, "needsFetchReviewReason": "USER_FLAGGED", "partnerItemId": "2", "pointsNotAwardedReason": "Action not allowed for user and CPG", "pointsPayerId": "5332f5fbe4b03c9a25efd0ba", "preventTargetGapPoints": true, "quantityPurchased": 1, "rewardsGroup": "DORITOS SPICY SWEET CHILI SINGLE SERVE", "rewardsProductPartnerId": "5332f5fbe4b03c9a25efd0ba", "userFlaggedBarcode": "028400642255", "userFlaggedDescription": "DORITOS TORTILLA CHIP SPICY SWEET CHILI REDUCED FAT BAG 1 OZ", "userFlaggedNewItem": true, "userFlaggedPrice": "10.00", "userFlaggedQuantity": 1}], "rewardsReceiptStatus": "FINISHED", "totalSpent": "11.00", "userId": "5ff1e194b6a9d73a3a9f1052"},{"_id": {"$oid": "5ff475820a7214ada10005cf"}, "createDate": {"$date": 1609856386000}, "dateScanned": {"$date": 1609856386000}, "modifyDate": {"$date": 1609856386000}, "rewardsReceiptStatus": "SUBMITTED", "userId": "5a43c08fe4b014fd6b6a0612"}]}
After applying json normalize, it looks like this (one column that is still nested):
I tried using record path, but it shows a KeyError:
Solution 1:[1]
import json
import pandas as pd
file = """<your_json_string>"""
data = json.loads(file)
# Remove the records that don't have the required Key.
data['result'] = [x for x in data['result'] if x.get('rewardsReceiptItemList')]
df = pd.json_normalize(data, ['result', 'rewardsReceiptItemList'])
print(df)
Output:
barcode description finalPrice itemPrice needsFetchReview partnerItemId ... needsFetchReviewReason pointsNotAwardedReason pointsPayerId rewardsGroup rewardsProductPartnerId userFlaggedDescription
0 4011 ITEM NOT FOUND 26.00 26.00 False 1 ... NaN NaN NaN NaN NaN NaN
1 4011 ITEM NOT FOUND 1 1 NaN 1 ... NaN NaN NaN NaN NaN NaN
2 028400642255 DORITOS TORTILLA CHIP SPICY SWEET CHILI REDUCE... 10.00 10.00 True 2 ... USER_FLAGGED Action not allowed for user and CPG 5332f5fbe4b03c9a25efd0ba DORITOS SPICY SWEET CHILI SINGLE SERVE 5332f5fbe4b03c9a25efd0ba DORITOS TORTILLA CHIP SPICY SWEET CHILI REDUCE...
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |


