Data frame, maybe a confusion matrix

I need to make an image like the one below: each cell contains the number of occurrences of the pair (X, Y).

Someone told me it's a confusion matrix, but I'm not sure.

It uses two columns of a data frame (X, Y); each cell of the graph shows the number of rows in which that combination of values appears.

import pandas as pd

data = {'x': [1, 2, 3, 1, 2, 1], 'y': [1, 2, 3, 1, 2, 2]}  

# create DataFrame  
df = pd.DataFrame(data)

# count how many times each (x, y) pair occurs and put the value in column n
occurrence = df.groupby(['x', 'y']).size().sort_values(ascending=False)

# turn the counts into a DataFrame with columns x, y, n
occurrence_df = occurrence.reset_index(name='n')

This code is a simplification of the idea; in the real data, adding up the numbers in all the cells shows that the DataFrame is huge. I already tried df_confusion_matrix = pd.crosstab(df["y"], df["x"]), but I think it doesn't work for a big DataFrame with a lot of other columns.
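For reference, here is a minimal sketch of how the pair counts could be laid out as a grid, using the same toy data as above. It assumes the goal is a table with the x values as rows and the y values as columns; the names counts and table are only illustrative. As far as I can tell, pd.crosstab only reads the two Series passed to it, so extra columns in the DataFrame should not affect the result.

```python
import pandas as pd

data = {'x': [1, 2, 3, 1, 2, 1], 'y': [1, 2, 3, 1, 2, 2]}
df = pd.DataFrame(data)

# Count each (x, y) pair, then move the y values into columns.
# Pairs that never occur are filled with 0 instead of NaN.
counts = df.groupby(['x', 'y']).size().unstack(fill_value=0)
print(counts)

# Same grid built with crosstab; only the two Series passed in are used,
# so any other columns of df are ignored.
table = pd.crosstab(df['x'], df['y'])
print(table)
```

Either table could then probably be passed to a plotting function such as seaborn's heatmap with annot=True to draw the image with the count in each cell, assuming seaborn is available.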



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow