How to write a method to check independence that returns a dictionary of length 3
I am having some difficulty understanding the question, and I am not sure how to write a method that returns a dictionary of length 3.
This is the sample table:
| X | Y | pr |
|---|---|---|
| 0 | 1 | 0.30 |
| 0 | 2 | 0.25 |
| 1 | 1 | 0.15 |
| 1 | 2 | 0.30 |
The returned dictionary needs three elements:
• first element (the key named are_independent) is a boolean that states whether X and Y are independent (True) or not (False). Two random variables are independent if P(X = x, Y = y) = P(X = x) * P(Y = y) for each possible value x of X and each possible value y of Y. I already have a solution for this (attached below).
• second element (the key named cov) is the covariance between X and Y (where i indexes the i-th of the n possible pairs (xi, yi) of (X, Y))
• third element (the key named corr) is the correlation coefficient between X and Y
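The formulas for the second and third elements are not shown above. As a hedged reference, these are the standard definitions for a discrete joint distribution, which is presumably what the exercise intends. The symbol p_i is an assumption introduced here for the value in the pr column of the i-th row.

```latex
\begin{aligned}
E[X] &= \sum_{i=1}^{n} x_i\, p_i, \qquad E[Y] = \sum_{i=1}^{n} y_i\, p_i \\
\operatorname{Cov}(X, Y) &= \sum_{i=1}^{n} (x_i - E[X])\,(y_i - E[Y])\, p_i \\
\operatorname{Var}(X) &= \sum_{i=1}^{n} (x_i - E[X])^2\, p_i, \qquad
\operatorname{Var}(Y) = \sum_{i=1}^{n} (y_i - E[Y])^2\, p_i \\
\operatorname{Corr}(X, Y) &= \frac{\operatorname{Cov}(X, Y)}{\sqrt{\operatorname{Var}(X)\,\operatorname{Var}(Y)}}
\end{aligned}
```

For the sample table this gives E[X] = 0.45, E[Y] = 1.55, Cov(X, Y) = 0.0525, Var(X) = Var(Y) = 0.2475, and Corr(X, Y) of about 0.212. The variables are not independent, since P(X = 0) * P(Y = 1) = 0.55 * 0.45 = 0.2475, while the table lists 0.30 for that pair.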
I have some idea about the first element, but I am really not sure about the other two.
import pandas as pd
import numpy as np

# you can use this table as an example
distr_table = pd.DataFrame({
    'X': [0, 0, 1, 1],
    'Y': [1, 2, 1, 2],
    'pr': [0.3, 0.25, 0.15, 0.3]
})


class CheckIndependence:
    def __init__(self):
        self.version = 1

    def check_independence(self, distr_table: pd.DataFrame):
        # marginal distributions P(X = x) and P(Y = y)
        pr_x = distr_table.groupby('X', as_index=False)['pr'].sum()
        pr_y = distr_table.groupby('Y', as_index=False)['pr'].sum()
        # every (x, y) pair together with the product of its marginals
        cmp = pd.merge(pr_x, pr_y, how='cross')
        cmp['indep_pr'] = cmp['pr_x'] * cmp['pr_y']
        # align with the joint probabilities on (X, Y) before comparing
        cmp = cmp[['X', 'Y', 'indep_pr']].merge(distr_table, on=['X', 'Y'])
        are_independent = np.allclose(cmp['indep_pr'], cmp['pr'])
        # cov and corr still need to be added to the returned dictionary
        return {'are_independent': are_independent}
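One possible way to complete the method is sketched below. It is not the only approach, and it rests on a few assumptions: the table has exactly the columns X, Y and pr, the probabilities sum to 1, and neither variable is constant (otherwise the correlation is undefined). Instead of the cross merge used above, this sketch uses `pivot_table` with `fill_value=0.0`, so that (x, y) pairs missing from the table count as probability 0 in the independence check. The local names `joint`, `p_x`, `p_y`, `ex` and `ey` are illustrative only.

```python
import numpy as np
import pandas as pd


class CheckIndependence:
    def __init__(self):
        self.version = 1

    def check_independence(self, distr_table: pd.DataFrame) -> dict:
        x, y, pr = distr_table['X'], distr_table['Y'], distr_table['pr']

        # Joint distribution as a grid: rows are values of X, columns are
        # values of Y; pairs absent from the table get probability 0.
        joint = distr_table.pivot_table(
            index='X', columns='Y', values='pr', aggfunc='sum', fill_value=0.0
        )
        p_x = joint.sum(axis=1).to_numpy()  # marginal P(X = x)
        p_y = joint.sum(axis=0).to_numpy()  # marginal P(Y = y)

        # Independent iff every joint cell equals the product of its marginals.
        are_independent = bool(np.allclose(joint.to_numpy(), np.outer(p_x, p_y)))

        # Expectations, covariance and variances as probability-weighted sums.
        ex = (x * pr).sum()
        ey = (y * pr).sum()
        cov = ((x - ex) * (y - ey) * pr).sum()
        var_x = ((x - ex) ** 2 * pr).sum()
        var_y = ((y - ey) ** 2 * pr).sum()

        # Assumes var_x and var_y are both positive.
        corr = cov / np.sqrt(var_x * var_y)

        return {
            'are_independent': are_independent,
            'cov': float(cov),
            'corr': float(corr),
        }


distr_table = pd.DataFrame({
    'X': [0, 0, 1, 1],
    'Y': [1, 2, 1, 2],
    'pr': [0.3, 0.25, 0.15, 0.3]
})

result = CheckIndependence().check_independence(distr_table)
print(result)
```

For the sample table, the returned dictionary has are_independent set to False, cov of about 0.0525 and corr of about 0.212. Comparing with `np.allclose` rather than `==` avoids false negatives from floating-point rounding.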
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
