MCAR Little's test in Python

How can I run Little's test for MCAR (missing completely at random) in Python? I have looked at the R package for this test, but I want to do it in Python. Is there an alternative approach to testing for MCAR?



Solution 1:[1]

You can use rpy2 to call R's implementation of Little's MCAR test from Python. Note that this approach requires writing a small amount of R code.

Set up rpy2 in Google Colab:

# rpy2 libraries
import rpy2.robjects as robjects
from rpy2.robjects.packages import importr
from rpy2.robjects import pandas2ri
from rpy2.robjects import globalenv
# Import R's base package
base = importr("base")

# Import R's utility packages
utils = importr("utils")

# Select mirror 
utils.chooseCRANmirror(ind=1)

# For automatic translation of Pandas objects to R
pandas2ri.activate()

# Enable R magic
%load_ext rpy2.ipython

# Make your pandas DataFrame accessible to R under the name r_df
# (df is assumed to be a DataFrame you have already loaded)
globalenv["r_df"] = df

You can now get R functionality within your Python environment by using R magics. Use %R for a single line of R code and %%R when the whole cell should be interpreted as R code.

To install an R package use: utils.install_packages("package_name")

You may also need to load it before it can be used: %R library(package_name)

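For Little's MCAR test, we need the naniar package. Installing it is slightly more involved, because the steps below download it from GitHub and that requires the remotes package; for other packages the general procedure above should be enough. If the naniar release available from your CRAN mirror already provides mcar_test, utils.install_packages("naniar") should also work.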

utils.install_packages("remotes")
%R remotes::install_github("njtierney/naniar")

Load naniar package:

%R library(naniar)

Pass your r_df to the mcar_test function:

# mcar_test on whole df
%R mcar_test(r_df)
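
The null hypothesis of Little's test is that the data are MCAR, so a small p-value is evidence against MCAR, while a large one only means the test found no evidence against it. Below is a minimal sketch of that decision rule. The helper name interpret_mcar and the 0.05 default threshold are illustrative assumptions, not part of naniar or rpy2.

```python
def interpret_mcar(p_value, alpha=0.05):
    """Summarize a Little's MCAR test p-value (hypothetical helper)."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value < alpha:
        # The null hypothesis (data are MCAR) is rejected at level alpha
        return "reject MCAR"
    # Failing to reject does not prove that the data are MCAR
    return "no evidence against MCAR"


print(interpret_mcar(0.003))
print(interpret_mcar(0.41))
```

In a notebook, the result can be copied back to Python with the R magic's -o option, for example %R -o res res <- mcar_test(r_df). The p-value should then be readable from res, assuming the result column is named p.value as described in the naniar documentation.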

If an error occurs, try including only the columns with missing data:

%%R
# mcar_test on columns with missing data
r_dfMissing <- r_df[c("col1", "col2", "col3")]
mcar_test(r_dfMissing)
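
The column names col1, col2 and col3 above are placeholders. As a rough sketch, the list can be derived from the pandas DataFrame instead of being typed by hand. Both helper names below are hypothetical, and the R string quoting relies on json.dumps producing a literal that R accepts, which should hold for ordinary column names but is an assumption.

```python
import json


def columns_with_missing(df):
    """Return the names of DataFrame columns that contain at least one NaN."""
    return [str(name) for name in df.columns[df.isna().any()]]


def r_select_expr(columns, source="r_df", target="r_dfMissing"):
    """Build the R assignment that subsets `source` to `columns`."""
    # json.dumps yields a double-quoted, escaped literal (assumed valid in R)
    quoted = ", ".join(json.dumps(str(name)) for name in columns)
    return f"{target} <- {source}[c({quoted})]"


print(r_select_expr(["col1", "col2", "col3"]))
# prints: r_dfMissing <- r_df[c("col1", "col2", "col3")]
```

With the setup from earlier in this answer, robjects.r(r_select_expr(columns_with_missing(df))) would create r_dfMissing in R, after which %R mcar_test(r_dfMissing) can be run as before.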

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1