'How to read list of json objects from Pandas DataFrame?

I want just want to loop through the array of json objects, and get the values of 'box'.....

I have a DataFrame which looks like this

       img                    facesJson
0   2b26mn4.jpg [{'box': [57, 255, 91, 103], 'confidence': 0.7...
1   cd7ntf.jpg  [{'box': [510, 85, 58, 87], 'confidence': 0.99...
2   m9kf3e.jpg  [{'box': [328, 78, 93, 123], 'confidence': 0.9...
3   b4hx0n.jpg  [{'box': [129, 30, 38, 54], 'confidence': 0.99...
4   afx0fm.jpg  [{'box': [86, 126, 221, 298], 'confidence': 0....

and the column 'facesJson' (dstype = object) contain array of json objects which look like this:

[
   {
      "box":[ 158,115,84,112 ],
      "confidence":0.9998929500579834,
   },
   {
      "box":[ 404,105, 86,114 ],
      "confidence":0.9996863603591919,
   }
]

when i run this code

for index,row in df.iterrows():
  df = pd.json_normalize(row['facesJson'])
  print(len(df))

i get this error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-53-8be848c6d44a> in <module>()
      1 for index,row in savedList.iterrows():
----> 2   df = pd.json_normalize(row['facesJson'])
      3   print(len(df))

1 frames
/usr/local/lib/python3.6/dist-packages/pandas/io/json/_normalize.py in <genexpr>(.0)
    272 
    273     if record_path is None:
--> 274         if any([isinstance(x, dict) for x in y.values()] for y in data):
    275             # naive normalization, this is idempotent for flat records
    276             # and potentially will inflate the data considerably for

AttributeError: 'str' object has no attribute 'values'


Solution 1:[1]

You can use this code on loop for each item of that column.

import ast

line='[ { "box":[ 158,115,84,112 ], "confidence":0.9998929500579834, }, { "box":[ 404,105, 86,114 ], "confidence":0.9996863603591919, } ]'


parsed=ast.literal_eval(line)

print(parsed[0].keys())

Solution 2:[2]

I would suggest using a list comprehension to look up the value of 'box' in each JSON object:

df['facesJson'].apply(lambda json: [item['box'] for item in json])

Solution 3:[3]

That something I run into, it's may help someone, so I have data as follow:

txt = """{
    "timeStamp": 1,
    "src": [
        { "id": 0, "tag": "", "x": 0.000, "y": 0.000, "z": 0.000, "activity": 0.000 },
        { "id": 0, "tag": "", "x": 0.000, "y": 0.000, "z": 0.000, "activity": 0.000 },
        { "id": 0, "tag": "", "x": 0.000, "y": 0.000, "z": 0.000, "activity": 0.000 },
        { "id": 0, "tag": "", "x": 0.000, "y": 0.000, "z": 0.000, "activity": 0.000 }
    ]
}"""

I need to convert the "src" into an array as follow:

array([[0, '', 0.0, 0.0, 0.0, 0.0],
       [0, '', 0.0, 0.0, 0.0, 0.0],
       [0, '', 0.0, 0.0, 0.0, 0.0],
       [0, '', 0.0, 0.0, 0.0, 0.0]], dtype=object)

here is how I did it:

import numpy as np
import pandas as pd
import json
from box import Box

dt = Box(json.loads(txt))
df = pd.DataFrame(dt.src)
ar = np.array(df)

if you print the df.head() you will get:

  id  tag x   y   z   activity
0 0   ''  0.0 0.0 0.0 0.0
1 0   ''  0.0 0.0 0.0 0.0
2 0   ''  0.0 0.0 0.0 0.0
3 0   ''  0.0 0.0 0.0 0.0

if there a more simple way to do, please let me know.

Solution 4:[4]

Figured out a pretty simple case when you want the entire dataframe to be turned into a list of json objects representing a row

import json
import pandas as pd
df = pd.DataFrame([{'col1':1,'col2':2},{'col1':3,'col2':4}])
json_list = json.loads(df.to_json(orient='records'))

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Ali Adhami
Solution 2 stinodego
Solution 3 Walid Bousseta
Solution 4 Inwon Kang