'Complex nested Dict to pandas dataframe
My dictionary
{'count': 1608,
'results': [{'entity': {'alternativeTitles': [],
'originalPublishers': [{'administratorPublishers': [{'id': None,
'ipId': '7040761',
'publisherName': 'THE ROYALTY NETWORK INC.',
'ipiNumber': '00255217086',
'hfaPublisherNumber': 'P1343F',
'publisherShare': 45,
'amend': False,
'email': None,
'phone': '+1-212-967-4300',
'website': None,
'mailingAddress': 'THE ROYALTY NETWORK, INC., THE ROYALTY NETWORK INC., 224 WEST 30TH STREET, STE. 1007, NEW YORK, NY, USA, 10001',
'hfaAgreement': False}],
'writers': [],
'isTopPublisher': False,
'id': None,
'ipId': '7305902',
'publisherName': 'WESTBURY MUSIC LTD',
'ipiNumber': '00144443097',
'publisherShare': 0,
'amend': False,
'totalWorks': None,
'hfaPublisherNumber': 'P9669N',
'associatedUsers': 0,
'email': None,
'phone': '',
'website': None,
'mailingAddress': '',
'hfaAgreement': False},
{'administratorPublishers': [{'id': None,
'ipId': '7304677',
'publisherName': 'ROYNET MUSIC',
'ipiNumber': '00339668123',
'hfaPublisherNumber': 'P9713E',
'publisherShare': 50,
'amend': False,
'email': None,
'phone': '+1-212-967-4300',
'website': None,
'mailingAddress': 'THE ROYALTY NETWORK, INC., ROYNET MUSIC, 224 WEST 30TH STREET, STE. 1007, NEW YORK, NY, USA, 10001',
'hfaAgreement': False}],
'writers': [],
'isTopPublisher': False,
'id': None,
'ipId': '7305902',
'publisherName': 'WESTBURY MUSIC LTD',
'ipiNumber': '00144443097',
'publisherShare': 0,
'amend': False,
'totalWorks': None,
'hfaPublisherNumber': 'P9669N',
'associatedUsers': 0,
'email': None,
'phone': '',
'website': None,
'mailingAddress': '',
'hfaAgreement': False}],
'writers': [{'id': None,
'localId': None,
'ipId': '11395522',
'firstName': 'TIM',
'lastName': 'ATACK',
'ipiNumber': '00121504722',
'roleCode': 10,
'writerShare': 0,
'amend': False,
'worksCount': 0},
{'id': None,
'localId': None,
'ipId': '8516232',
'firstName': "DES'REE",
'lastName': 'WEEKES',
'ipiNumber': '',
'roleCode': 10,
'writerShare': 0,
'amend': False,
'worksCount': 0}],
'recordings': [{'id': None,
'title': 'KISSING YOU LOVE',
'artistName': '',
'isrc': 'GBMJG1200449',
'duration': '00:00',
'publishOwner': None,
'publishYear': None,
'albumTitle': None,
'label': 'THE CAIRN STRIN',
'releaseDate': '2017-06-30 00:00:00',
'amend': False},
{'id': None,
'title': 'KISSING YOU LOVE',
'artistName': "DES'REE",
'isrc': None,
'duration': '00:00',
'publishOwner': None,
'publishYear': None,
'albumTitle': None,
'label': '',
'releaseDate': '',
'amend': False}],
'oldDerivative': None,
'preTerminationEndDate': None,
'terminationDate': None,
'holdTypeId': 0,
'id': 810545286,
'title': 'KISSING YOU LOVE',
'duration': '00:00',
'languageCode': None,
'copyrightOfficeNumber': None,
'copyrightOfficeDate': None,
'firstUseRefusalIndicator': False,
'isComplete': True,
'propertyId': '810545286',
'rightsholderProprietaryId': None,
'iswc': None,
'updateDate': None,
'totalKnownShares': 95,
'registrationHistoryStatus': None,
'registrationHistoryIndicator': None,
'registrationHistoryDate': None,
'submissionDate': None,
'submissionId': None,
'songCode': 'KV7LMB'}}],
'error': None}
I want to explode this correctly.
Expected output
| count | results | entity | alternativeTitles | originalPublishers | administratorPublishers | id |
|---|---|---|---|---|---|---|
| 1608 | N/A | N/A | N/A | N/A | N/A | N/A |
Continuing the table for better understand
| ipId | publisherName | ipiNumber | hfaPublisherNumber | PublisherShare |
|---|---|---|---|---|
| 7040761 | THE ROYALTY NETWORK INC. | 00255217086 | P1343F | 45 |
My current code
result = requests.post(url, json=payload).json()
df = json_normalize(result)
This is not exploding the data properly as expected, I want each key(Including nested keys) as a column name.
Solution 1:[1]
I think what you're looking for is a multi index dataframe. You can create one from a nested dictionary by iterating over the outer and inner keys is the dictionary.
This article shows an example of how this is done:
# Import module
import pandas as pd
# Nested dictionary to convert it into multiindex dataframe
nested_dict = {'India': {'State': ['Maharashtra', 'West Bengal',
'Uttar Pradesh', 'Bihar', 'Karnataka'],
'Capital': ['Mumbai', 'Kolkata', 'Lucknow',
'Patna', 'Bengaluru']},
'America': {'State': ['California', 'Florida', 'Georgia',
'Massachusetts', 'New York'],
'Capital': ['Sacramento', 'Tallahassee', 'Atlanta',
'Boston', 'Albany']}}
reformed_dict = {}
for outerKey, innerDict in nested_dict.items():
for innerKey, values in innerDict.items():
reformed_dict[(outerKey, innerKey)] = values
# Display multiindex dataframe
multiIndex_df = pd.DataFrame(reformed_dict)
multiIndex_df
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Anne0102 |
