Chi2 heat map for categorical data to test for MCAR: cannot unpack non-iterable rv_frozen object
I want to understand if my missing data is MCAR or not.
I have a data set like this, where 0 means the data is present and 1 means the data is missing:
a b c d e
0 1 0 0 0
0 0 0 0 0
0 0 0 0 0
0 1 0 0 0
0 0 0 0 0
I want to understand if the data in column b is MCAR, so I want to make a heatmap of chi2 p-values between all pairs of columns (as far as I understand, if all p-values are > 0.05, the data can possibly be considered MCAR).
I wrote this:
import pandas as pd
import numpy as np
from scipy.stats import chi2

df = pd.read_csv('binary_to_check_for_missing_data.txt',header=0,sep='\t')
column_names = df.columns
resultant = pd.DataFrame(data=[(0 for i in range(len(df.columns))) for i in range(len(df.columns))],
                         columns=list(df.columns))
resultant.set_index(pd.Index(list(df.columns)), inplace = True)
for i in list(df.columns):
    for j in list(df.columns):
        if i != j:
            chi2_val, p_val = chi2(np.array(df[i]).reshape(-1, 1), np.array(df[j]).reshape(-1, 1))
            resultant.loc[i,j] = p_val
print(resultant)
I get the error:
Traceback (most recent call last):
  File "chi2_contingency.py", line 16, in <module>
    chi2_val, p_val = chi2(np.array(df[i]).reshape(-1, 1), np.array(df[j]).reshape(-1, 1))
TypeError: cannot unpack non-iterable rv_frozen object
I just don't really understand the error. Since the data is categorical, is it telling me I wasn't meant to turn the data into a np.array?
Solution 1:[1]
Change chi2 to chisquare.
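scipy.stats.chi2 is the SciPy implementation of the chi-squared probability distribution. Calling it does not perform the chi-squared test; it returns a "frozen" distribution object (rv_frozen), which cannot be unpacked into two values. That is the source of the TypeError.
The function scipy.stats.chisquare performs a chi-squared test and returns the chi-squared statistic and the p-value, which is the pair of values you are trying to unpack. Note that it is a one-way (goodness-of-fit) test: its second argument is interpreted as expected frequencies, not as a second variable.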
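If the goal is to test whether missingness in one column is associated with missingness in another, a chi-squared test of independence on a contingency table may be a closer fit than a one-way test. The sketch below is one possible way to do that, not the only one: it builds each 2x2 table with pd.crosstab and passes it to scipy.stats.chi2_contingency. The function name pairwise_chi2_pvalues and the demo data are made up for illustration. Pairs where a column is constant (for example, a column with no missing values at all) are left as NaN, since no association can be tested there.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency


def pairwise_chi2_pvalues(indicators: pd.DataFrame) -> pd.DataFrame:
    """Return a matrix of chi-squared independence-test p-values.

    `indicators` is assumed to hold 0/1 missingness flags, one column
    per variable. This helper name is hypothetical, not a library API.
    """
    cols = list(indicators.columns)
    # Start from NaN so untested pairs (and the diagonal) stay visibly empty.
    pvals = pd.DataFrame(np.nan, index=cols, columns=cols, dtype=float)
    for i in cols:
        for j in cols:
            if i == j:
                continue
            # Cross-tabulate the two indicator columns into a contingency table.
            table = pd.crosstab(indicators[i], indicators[j])
            # A constant column yields a table with a single row or column;
            # independence cannot be tested, so leave NaN.
            if table.shape[0] < 2 or table.shape[1] < 2:
                continue
            _, p_val, _, _ = chi2_contingency(table.to_numpy())
            pvals.loc[i, j] = p_val
    return pvals


# Made-up demo data: 1 = missing, 0 = present.
demo = pd.DataFrame({
    "a": [0, 1, 0, 1, 0, 1, 0, 1],
    "b": [0, 1, 0, 1, 0, 1, 1, 0],
    "c": [0, 0, 0, 0, 0, 0, 0, 0],  # never missing, so untestable
})
result = pairwise_chi2_pvalues(demo)
print(result)
```

The resulting DataFrame can be passed to a heatmap function as-is. With very few missing values per column, the expected cell counts become small and the chi-squared approximation is unreliable, so the p-values should be read with caution.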
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
