Creating a multiindex heatmap from two dicts and a pandas DataFrame

I have two dicts that share the same keys, though not necessarily in the same order.

DictA = {"Asia": ["Japan", "China", "Laos"], "Europe": ["England", "Sweden"]}
DictB = {"Europe": ["Denmark", "Hungary", "Spain", "Moldova"], "Asia": ["Mongolia", "Thailand"]}

The keys correspond to the columns, and the values to the rows, of a pandas DataFrame whose values need to be plotted as a heatmap.

Df =
Country   Asia    Europe
Japan     3       1 
Sweden    2       2
England   1       4
China     5       9 
Laos      1       9
Denmark   3       1
Mongolia  1       7
Thailand  7       4
Hungary   7       3 
Spain     2       9
Moldova   1       5
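
To make the example reproducible, the table above could be built as follows. This is only a sketch: the question does not say whether Country is a regular column or the index, so using it as the index here is an assumption.

```python
import pandas as pd

# Values copied from the table in the question.
# Assumption: "Country" is used as the index rather than kept as a column.
Df = pd.DataFrame(
    {
        "Country": ["Japan", "Sweden", "England", "China", "Laos", "Denmark",
                    "Mongolia", "Thailand", "Hungary", "Spain", "Moldova"],
        "Asia": [3, 2, 1, 5, 1, 3, 1, 7, 7, 2, 1],
        "Europe": [1, 2, 4, 9, 9, 1, 7, 4, 3, 9, 5],
    }
).set_index("Country")

print(Df)
```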

What I need to figure out is how to use these dicts to cross-reference the pandas DataFrame and make a heatmap whose cells are colored by the DataFrame's values. The country names should be on the lower axis of the heatmap (though the choice of axis is not important), and the continent each country belongs to should appear underneath the country names, further from the heatmap. So if the country names are on the left side of the heatmap, the continent names should be even further to the left to show where each country belongs.

I haven't got the slightest clue how to do this. Any help is greatly appreciated!



Solution 1:[1]

So, you have to build a matrix from the data and pass it to the DataFrame constructor.

To build the matrix, first collect every distinct continent and country into two lists. These lists provide the labels, and each item's position in its list is its index in the matrix.

Then create a zero-filled matrix of size countries x continents and, for each dict, count the occurrences of each country in each continent.

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
 
DictA = {"Asia": ["Japan", "China", "Laos"], "Europe": ["England", "Sweden"]}
DictB = {"Europe": ["Denmark", "Hungary", "Spain", "Moldova"], "Asia": ["Mongolia", "Thailand"]}


def createDataFrame(dicts):
  continents = []
  countries = []

  for aDict in dicts:
    for continent in aDict:
      if continent not in continents:
        continents.append(continent)

      for country in aDict[continent]:
        if country not in countries:
          countries.append(country)

  continentsLength = len(continents)
  countriesLength = len(countries)

  # countries rows x continents cols
  matrix =  np.zeros((countriesLength, continentsLength)) 

  for aDict in dicts:
    for continent in aDict:
      continentIndex = continents.index(continent)

      for country in aDict[continent]:
        countryIndex = countries.index(country)

        matrix[countryIndex, continentIndex] += 1

  return pd.DataFrame(matrix, columns=continents, index=countries)

df = createDataFrame([DictA, DictB])
 
plt.imshow(df, cmap="YlGnBu")
plt.colorbar()
plt.xticks(range(len(df.columns)),df.columns, rotation=20)
plt.yticks(range(len(df.index)),df.index)
plt.show()

This produces the following output:

[Image: heatmap with countries on the y-axis and continents on the x-axis]
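
The code above colors each cell by how often a country appears under a continent in the dicts. If the goal is instead to color by the values in the question's Df and show each country's continent as an outer label, one possible approach is to invert the dicts into a country-to-continent lookup and use it to build a (Continent, Country) MultiIndex. The following is a minimal sketch under those assumptions; the names `country_to_continent`, `grouped` and `plot_grouped_heatmap` are hypothetical, and the label offset and margin are guesses that will probably need tuning.

```python
import matplotlib.pyplot as plt
import pandas as pd

DictA = {"Asia": ["Japan", "China", "Laos"], "Europe": ["England", "Sweden"]}
DictB = {"Europe": ["Denmark", "Hungary", "Spain", "Moldova"], "Asia": ["Mongolia", "Thailand"]}

# Values from the question's table; using Country as the index is an assumption.
Df = pd.DataFrame(
    {
        "Asia": [3, 2, 1, 5, 1, 3, 1, 7, 7, 2, 1],
        "Europe": [1, 2, 4, 9, 9, 1, 7, 4, 3, 9, 5],
    },
    index=pd.Index(
        ["Japan", "Sweden", "England", "China", "Laos", "Denmark",
         "Mongolia", "Thailand", "Hungary", "Spain", "Moldova"],
        name="Country",
    ),
)

# Invert both dicts into a single country -> continent lookup.
country_to_continent = {
    country: continent
    for d in (DictA, DictB)
    for continent, countries in d.items()
    for country in countries
}

# Replace the index with (Continent, Country) and sort so that
# the countries of each continent sit next to each other.
grouped = Df.copy()
grouped.index = pd.MultiIndex.from_arrays(
    [grouped.index.map(country_to_continent), grouped.index],
    names=["Continent", "Country"],
)
grouped = grouped.sort_index()


def plot_grouped_heatmap(frame):
    """Draw the values as a heatmap with countries as tick labels
    and one continent label per block, further to the left."""
    fig, ax = plt.subplots(figsize=(5, 6))
    image = ax.imshow(frame.to_numpy(), cmap="YlGnBu", aspect="auto")
    fig.colorbar(image, ax=ax)

    ax.set_xticks(range(frame.shape[1]))
    ax.set_xticklabels(frame.columns)
    countries = frame.index.get_level_values("Country")
    ax.set_yticks(range(len(countries)))
    ax.set_yticklabels(countries)

    # x is given in axes coordinates, y in data (row) coordinates.
    continents = list(frame.index.get_level_values("Continent"))
    for continent in dict.fromkeys(continents):
        rows = [i for i, c in enumerate(continents) if c == continent]
        center = (rows[0] + rows[-1]) / 2
        ax.text(-0.35, center, continent, transform=ax.get_yaxis_transform(),
                ha="right", va="center", rotation=90)

    fig.subplots_adjust(left=0.35)  # leave room for both label levels
    return fig, ax


fig, ax = plot_grouped_heatmap(grouped)
```

Calling `plt.show()` afterwards displays the figure. Sorting the MultiIndex is what keeps each continent's rows contiguous, so a single label per block is enough.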

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Ignacior