How to group a DataFrame and then put two columns into one JSON-like column in Python
The data I have looks something like this:
| country | population | area | city | city_population |
|---|---|---|---|---|
| USA | 331893745 | 9833520 | New York | 8804190 |
| USA | 331893745 | 9833520 | Los Angeles | 3898747 |
| USA | 331893745 | 9833520 | Chicago | 2746388 |
| UK | 243610 | 66366000 | London | 7556900 |
| UK | 243610 | 66366000 | Birmingham | 984333 |
| Canada | 9984670 | 38532853 | Toronto | 2600000 |
| Canada | 9984670 | 38532853 | Montreal | 1600000 |
| Canada | 9984670 | 38532853 | Calgary | 1019942 |
I am looking for output like this:
| country | population | area | cities |
|---|---|---|---|
| USA | 331893745 | 9833520 | {'New York' : 8804190, 'Los Angeles' : 3898747, 'Chicago' : 2746388} |
| UK | 243610 | 66366000 | {'London' : 7556900, 'Birmingham' : 984333} |
| Canada | 9984670 | 38532853 | {'Toronto' : 2600000, 'Montreal' : 1600000, 'Calgary' : 1019942} |
So basically I want to group by the country column and then put city and city_population into a JSON-like column while keeping the other columns.
Any help is appreciated.
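One possible approach, sketched here with pandas (the question does not name a library, so this is an assumption): group on the columns that repeat for each country, then build a dict from `city` and `city_population` within each group. The variable names `df`, `keys`, and `result` are illustrative, and the sample values are copied from the table above.

```python
import pandas as pd

# Sample data copied from the question's input table.
df = pd.DataFrame({
    "country": ["USA", "USA", "USA", "UK", "UK", "Canada", "Canada", "Canada"],
    "population": [331893745] * 3 + [243610] * 2 + [9984670] * 3,
    "area": [9833520] * 3 + [66366000] * 2 + [38532853] * 3,
    "city": ["New York", "Los Angeles", "Chicago", "London", "Birmingham",
             "Toronto", "Montreal", "Calgary"],
    "city_population": [8804190, 3898747, 2746388, 7556900, 984333,
                        2600000, 1600000, 1019942],
})

# Columns that are constant within each country and should be kept as-is.
keys = ["country", "population", "area"]

result = (
    # sort=False keeps the groups in order of first appearance.
    df.groupby(keys, sort=False)[["city", "city_population"]]
      # Each group becomes one dict mapping city name to city population.
      .apply(lambda g: dict(zip(g["city"], g["city_population"])))
      # Turn the group keys back into columns and name the dict column.
      .reset_index(name="cities")
)

print(result)
```

This assumes `population` and `area` are identical on every row of a given country; if they are not, grouping on all three columns would split that country into several rows. The `cities` column holds Python dicts here. If an actual JSON string is needed instead, mapping `json.dumps` over that column should work.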
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
