Create multi-level nested JSON from a PySpark DataFrame in Databricks and save it as a table

I have a DataFrame in the following format, where the integer values in the day columns represent the activity count (activitycount):

+------------+---------------+-------------------+-----+-----+------+-----+------+-----+-----+
| state      | city          | email             | Mon | Tue | Wed  | Thu | Fri  | Sat | Sun |
+------------+---------------+-------------------+-----+-----+------+-----+------+-----+-----+
| New York   | New York City | [email protected] | 10  | 15  | null | 8   | null | 9   | 22  |
| New York   | Mahattan      | [email protected] | 11  | 13  | null | 2   | null | 7   | 12  |
| California | San francisco | [email protected] | 11  | 13  | null | 2   | null | 7   | 12  |
| California | San francisco | [email protected] | 11  | 13  | null | 2   | null | 7   | 12  |
+------------+---------------+-------------------+-----+-----+------+-----+------+-----+-----+

How can I create a JSON object in the following format for each email and each day, insert it into a DataFrame like the one below, and save the result as a table?

+-------------------+----------+----------+----------+-----+-----+-----+-----+
| email             | Mon      | Tue      | Wed      | Thu | Fri | Sat | Sun |
+-------------------+----------+----------+----------+-----+-----+-----+-----+
| [email protected] | json obj | json obj | json obj | ... | ... | ... | ... |
| [email protected] | json obj | json obj | json obj | ... | ... | ... | ... |
+-------------------+----------+----------+----------+-----+-----+-----+-----+

The JSON object format is shown below:

[Image: JSON object format]
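One possible approach, sketched under assumptions: the exact JSON format was given only as an image, so the nesting used here (a list of states, each holding a list of cities with their activitycount) is a guess at the intended shape, not the format from the question. The function name `nest_activity_by_day` and the table name `activity_json_by_day` are hypothetical. The idea is to unpivot the day columns, aggregate twice with `collect_list` and `struct` to build the nesting, serialize with `to_json`, and pivot back to one column per day.

```python
import json

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def nest_activity_by_day(df, days=DAYS):
    """Return one row per email with a JSON string in each day column.

    Assumed JSON shape (the original format is not known):
    [{"state": ..., "cities": [{"city": ..., "activitycount": ...}]}]
    """
    # Imported here so the definition also loads where PySpark is absent;
    # on a Databricks cluster PySpark is already available.
    from pyspark.sql import functions as F

    # Unpivot: one row per (email, state, city, day).
    day_entries = F.array(*[
        F.struct(
            F.lit(d).alias("day"),
            F.col(d).cast("int").alias("activitycount"),
        )
        for d in days
    ])
    long_df = df.select(
        "email", "state", "city", F.explode(day_entries).alias("entry")
    ).select("email", "state", "city", "entry.day", "entry.activitycount")

    # Inner level: cities grouped under each state.
    per_state = long_df.groupBy("email", "day", "state").agg(
        F.collect_list(F.struct("city", "activitycount")).alias("cities")
    )

    # Outer level: states grouped under each (email, day), serialized to JSON.
    # Fields that are null (e.g. a missing count) are typically left out
    # of the JSON string by to_json.
    per_day = per_state.groupBy("email", "day").agg(
        F.to_json(F.collect_list(F.struct("state", "cities"))).alias("json")
    )

    # Pivot back so each day becomes a column holding the JSON string.
    return per_day.groupBy("email").pivot("day", days).agg(F.first("json"))


# Usage in a Databricks notebook, where `df` is the input DataFrame
# (the table name is a placeholder):
# nest_activity_by_day(df).write.mode("overwrite").saveAsTable("activity_json_by_day")
```

Passing the day names to `pivot` fixes the column order and avoids an extra pass over the data to discover the distinct values. If the day columns should hold real nested structures rather than JSON strings, dropping the `to_json` call would keep them as arrays of structs.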



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow