Data frame, maybe a confusion matrix
I need to make an image like the one below: each cell contains the number of occurrences of the pair (X, Y).
Someone told me it's a confusion matrix, but I'm not sure.
It uses two columns of a data frame (X, Y); each cell of the graph shows the number of rows in which that combination of values appears.
import pandas as pd
data = {'x': [1, 2, 3, 1, 2, 1], 'y': [1, 2, 3, 1, 2, 2]}
# create DataFrame
df = pd.DataFrame(data)
# count how many times each (x, y) pair occurs and put the count in column n
occurrence = df.groupby(['x', 'y']).size().sort_values(ascending=False)
occurrence_df = occurrence.to_frame()
occurrence_df.reset_index(inplace=True)
occurrence_df.columns = ['x', 'y', 'n']  # name the columns
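The long-format counts can also be reshaped into the grid described above, with one row per y value and one column per x value. The following is a minimal sketch on the same toy data, not a definitive solution; the name `grid` is an assumption introduced here for illustration.

```python
import pandas as pd

data = {'x': [1, 2, 3, 1, 2, 1], 'y': [1, 2, 3, 1, 2, 2]}
df = pd.DataFrame(data)

# Count each (y, x) pair, then move the x level into the columns.
# Pairs that never occur are filled with 0 instead of NaN.
grid = df.groupby(['y', 'x']).size().unstack(fill_value=0)
print(grid)
```

Each cell `grid.loc[y, x]` then holds the number of rows with that combination, which is the layout a heatmap-style image needs.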
This code is a simplification of the idea; in the real data, adding up the numbers in all the cells shows that the DataFrame is huge.
I already tried df_confusion_matrix = pd.crosstab(df["y"], df["x"]),
but I think it doesn't work for a big DataFrame with a lot of other columns.
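To check the concern about extra columns, here is a small sketch assuming a hypothetical `other` column that stands in for the unrelated columns of the real data. `pd.crosstab` only reads the two Series passed to it, so the remaining columns should not affect the result; whether it is fast enough on the real data depends on its size and was not tested here.

```python
import pandas as pd

# Hypothetical frame: 'other' stands in for the extra columns of the real data.
df = pd.DataFrame({
    'x': [1, 2, 3, 1, 2, 1],
    'y': [1, 2, 3, 1, 2, 2],
    'other': ['a', 'b', 'c', 'd', 'e', 'f'],
})

# Only the two Series passed in are used; 'other' is never touched.
counts = pd.crosstab(df['y'], df['x'])
print(counts)
```

If seaborn is available, `seaborn.heatmap(counts, annot=True, fmt='d')` is one option for drawing the table as an annotated image.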
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow