'Convert a MultiIndex pandas DataFrame to a nested JSON

I have the following Dataframe with MultiIndex rows in pandas.

               time    available_slots       status
month day                                          
1     1    10:00:00                  1    AVAILABLE
      1    12:00:00                  1    AVAILABLE
      1    14:00:00                  1    AVAILABLE
      1    16:00:00                  1    AVAILABLE
      1    18:00:00                  1    AVAILABLE
      2    10:00:00                  1    AVAILABLE
...             ...                ...          ...
2     28   12:00:00                  1    AVAILABLE
      28   14:00:00                  1    AVAILABLE
      28   16:00:00                  1    AVAILABLE
      28   18:00:00                  1    AVAILABLE
      28   20:00:00                  1    AVAILABLE

And I need to transform it to a hierarchical nested JSON as this:

[
    {
        "month": 1,
        "days": [
            {
                "day": 1,
                "slots": [
                    {
                        "time": "10:00:00",
                        "available_slots": 1,
                        "status": "AVAILABLE"
                    },
                    {
                        "time": "12:00:00",
                        "available_slots": 1,
                        "status": "AVAILABLE"
                    },
                    ...
                ]
            },
            {
                "day": 2,
                "slots": [
                    ...
                ]
            }
        ]
    },
    {
        "month": 2,
        "days":[
            {
                "day": 1,
                "slots": [
                    ...
                ]
            }
        ]
    },
    ...
]

Unfortunately, it is not as easy as doing df.to_json(orient="index").

Does anyone know if there is a method in pandas to perform this kind of transformations? or in what way I could iterate over the DataFrame to build the final object?



Solution 1:[1]

You can use a double loop for each level of your index:

data = []
for month, df1 in df.groupby(level=0):
    data.append({'month': month, 'days': []})
    for day, df2 in df1.groupby(level=1):
        data[-1]['days'].append({'day': day, 'slots': df2.to_dict('records')})

Output:

import json
print(json.dumps(data, indent=2))

[
  {
    "month": 1,
    "days": [
      {
        "day": 1,
        "slots": [
          {
            "time": "10:00:00",
            "available_slots": 1,
            "status": "AVAILABLE"
          },
          {
            "time": "12:00:00",
            "available_slots": 1,
            "status": "AVAILABLE"
          },
          {
            "time": "14:00:00",
            "available_slots": 1,
            "status": "AVAILABLE"
          },
          {
            "time": "16:00:00",
            "available_slots": 1,
            "status": "AVAILABLE"
          },
          {
            "time": "18:00:00",
            "available_slots": 1,
            "status": "AVAILABLE"
          }
        ]
      },
      {
        "day": 2,
        "slots": [
          {
            "time": "10:00:00",
            "available_slots": 1,
            "status": "AVAILABLE"
          }
        ]
      }
    ]
  },
  {
    "month": 2,
    "days": [
      {
        "day": 28,
        "slots": [
          {
            "time": "12:00:00",
            "available_slots": 1,
            "status": "AVAILABLE"
          },
          {
            "time": "14:00:00",
            "available_slots": 1,
            "status": "AVAILABLE"
          },
          {
            "time": "18:00:00",
            "available_slots": 1,
            "status": "AVAILABLE"
          },
          {
            "time": "20:00:00",
            "available_slots": 1,
            "status": "AVAILABLE"
          }
        ]
      }
    ]
  }
]

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1