pandas series containing bytes or string data to json or dictionary type
I have a pandas series with bytes datatype that I'd like to transform so I can parse and manipulate its contents.
import pandas as pd
from ast import literal_eval
df = pd.DataFrame({'id': [0],
'bdata': ["b'{\"status\":\"SuccessWithResult\",\"total\":13}"]
})
type(df['bdata'][0])
bytes
# Transform to dict
df['bdata'] = df['bdata'].apply(literal_eval)
ValueError: malformed node or string: b'
How do I convert pandas series of type bytes to either json or dict type?
- The data may appear as str but it is actually stored as bytes in the pandas DataFrame.
Solution 1:[1]
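The values in the bdata column of your example are not bytes; they are strings, as type(df['bdata'][0]) will tell you. The leading b' is misleading: it is part of the string's text, not a bytes literal. So you have to remove the characters b' from the string before applying literal_eval. You can do that with Series.str.strip, which removes any of the given characters from both ends of each string: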
from ast import literal_eval
import pandas as pd
df = pd.DataFrame({'id': [0],
'bdata': ["b'{\"status\":\"SuccessWithResult\",\"total\":13}"]
})
df['bdata'] = df['bdata'].str.strip("b'").apply(literal_eval)
Output:
>>> df['bdata']
0 {'status': 'SuccessWithResult', 'total': 13}
Name: bdata, dtype: object
>>> df['bdata'].apply(type)
0 <class 'dict'>
Name: bdata, dtype: object
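If the column really holds bytes objects rather than strings that merely start with b', literal_eval will not accept them (given bytes, it raises ValueError: malformed node or string), and there is nothing to strip. Here is a minimal sketch that handles both cases. It assumes the payload is UTF-8 encoded JSON, and the helper name to_dict is made up for illustration; it is not part of pandas.

```python
import json

import pandas as pd


def to_dict(value):
    """Parse a JSON payload stored as bytes or as the text of a bytes literal."""
    if isinstance(value, (bytes, bytearray)):
        # Real bytes: decode to text first (UTF-8 is an assumption).
        text = value.decode("utf-8")
    else:
        text = str(value)
        # Stringified bytes: drop a leading b' or b" and, if present,
        # the matching trailing quote.
        if text[:2] in ("b'", 'b"'):
            quote = text[1]
            text = text[2:]
            if text.endswith(quote):
                text = text[:-1]
    return json.loads(text)


df = pd.DataFrame({
    "id": [0, 1],
    "bdata": [
        b'{"status":"SuccessWithResult","total":13}',         # real bytes
        "b'{\"status\":\"SuccessWithResult\",\"total\":13}",  # str that looks like bytes
    ],
})
df["bdata"] = df["bdata"].apply(to_dict)
```

Slicing off the b' prefix explicitly avoids a side effect of str.strip, which removes any run of the listed characters from both ends rather than a fixed prefix. json.loads is used here because the payload is JSON; it also accepts true, false and null, which literal_eval does not. If the stringified values contain Python escape sequences, this sketch may need adjusting.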
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Rodalm |
