'Python API JSON Output Flatten - List within Response
I'm trying to completely flatten my json response from an API into a pandas dataframe and have no idea how to flatten out a list of objects within the response - This relates to the "Lines" column located in the documentation here and below.
"Lines" : [
{
"Account" : {
"UID" : "17960eb4-3e14-4805-aae2-5b2387da1153",
"Name" : "Trade Debtors",
"DisplayID" : "1-1310",
"URI" : "{cf_uri}/GeneralLedger/Account/17960eb4-3e14-4805-aae2-5b2387da1153"
},
"Amount" : 100,
"IsCredit" : false,
"Job" : null,
"LineDescription" : ""
"ReconciledDate" : null,
"UnitCount": null
},
{
"Account" : {
"UID" : "f7d18c92-ada8-428e-b02a-9223022f84b2",
"Name" : "Late Fees Collected",
"DisplayID" : "4-3000",
"URI" : "{cf_uri}/GeneralLedger/Account/f7d18c92-ada8-428e-b02a-9223022f84b2"
},
"Amount" : 90.91,
"IsCredit" : true,
"Job" : null,
"LineDescription" : "Line 1 testing",
"UnitCount": null
},
{
"Account" : {
"UID" : "5427d47c-499a-4386-ad67-72de39520a00",
"Name" : "GST Collected",
"DisplayID" : "2-1210",
"URI" : "{cf_uri}/GeneralLedger/Account/5427d47c-499a-4386-ad67-72de39520a00"
},
"Amount" : 9.09,
"IsCredit" : true,
"Job" : null,
"LineDescription" : "",
"ReconciledDate" : null,
"UnitCount": null
}
],
My Code:
import pandas as pd
import requests
payload={}
headers = {
'x-myobapi-key': client_id,
'x-myobapi-version': 'v2',
'Accept-Encoding': 'gzip,deflate',
'Authorization': f'Bearer {access_token}'
}
response = requests.request("GET", url, headers=headers, data=payload)
result = response.json()
df = pd.json_normalize(result, 'Items')
while result['NextPageLink'] is not None:
response = requests.request("GET", result['NextPageLink'], headers=headers, data=payload)
result = response.json()
df1 = pd.json_normalize(result, 'Items')
df = df.append(df1)
This code above appends each page of results until there isn't a link, as you can see the following output was able to expand the SourceTransactions columns but not the Lines columns as it appears to be in list format?
In order for me to access lines I need to use the following result["Items"][0]["Lines"] except that's only for the first element
+------+-----------+-------------+--------------+------------+-------------+--------------------------------+------------------------+------------+-----------------------+-----------------------------------+-----------------------+
| UID | DisplayID | JournalType | DateOccurred | DatePosted | Description | Lines | URI | RowVersion | SourceTransaction.UID | SourceTransaction.TransactionType | SourceTransaction.URI |
+------+-----------+-------------+--------------+------------+-------------+--------------------------------+------------------------+------------+-----------------------+-----------------------------------+-----------------------+
| a100 | PJ001 | Purchase | 2022-01-01 | 2022-01-01 | Transaction | [{'Account': {'UID': '73971... | https://arl1.api.my... | -139 | e06f592c-23b | Bill | https://arl1.api... |
+------+-----------+-------------+--------------+------------+-------------+--------------------------------+------------------------+------------+-----------------------+-----------------------------------+-----------------------+
Solution 1:[1]
For others that stumble across the same problem, turns out reading the documentation helped me - Who knew?!
I just needed to make a few tweaks with the json_normalize function. Also the order you type the meta parameters in matters, so you'll need to ensure your order matches the API documentation.
df = pd.json_normalize(data=result["Items"], record_path='Lines', meta=['UID',
'DisplayID','JournalType', ['SourceTransaction', 'UID'], ['SourceTransaction', 'TransactionType'], ['SourceTransaction',
'URI'], 'DateOccurred','DatePosted','Description','URI','RowVersion'])
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
