How can I use a statistical test with this dataset? (p-value > 1)

I am trying to use a t-test to answer one of my project questions. To do so, I created a subset of the data and then applied a chi-square test to check whether the data is suitable for a t-test. According to the results, the p-value is approximately 3.5, which is impossible. I thought this could be because of the sample size of the subset I specified, or the sample size of the dependent variable (a new column I calculated, with about 178 values).

In detail: the code I am sharing is for the project's first question (GitHub link attached).

Dependent variable: Delay; independent variable: Gender

The code I tried:

# Subset data

male = df.query('Gender == "0"')['Delay']

female = df.query('Gender == "1"')['Delay']

df.groupby('Gender').describe()

# Create contingency table

GD = pd.crosstab(index=df['Gender'], columns=df['Delay'], margins=True)

GD

# Chi-square test

chiRes = stats.chi2_contingency(GD)

print(f'chi-square statistic: {chiRes[0]}')

print(f'p-value: {chiRes[1]}')

print(f'degree of freedom: {chiRes[2]}')

print('expected contingency table')

print(chiRes[3])

And these are the findings:

chi-square statistic: 519.651581316998

p-value: 3.590660196919681e-19 (?)

degree of freedom: 262 (?)
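A note on reading this output: `3.590660196919681e-19` is scientific notation for roughly 3.59 × 10^-19, a value very close to zero, not 3.5. The sketch below uses made-up data (the `toy` frame and its values are assumptions, not the project's dataset) to show the formatting, and also how `margins=True` adds an `All` row and column that `chi2_contingency` then counts as an extra category. A degrees-of-freedom value of 262 = 2 × 131 would be consistent with three rows (two genders plus `All`), though that is an inference from the output rather than something confirmed here.

```python
import numpy as np
import pandas as pd
from scipy import stats

# The p-value reported above, written exactly as printed.
p_value = 3.590660196919681e-19
print(f"{p_value:.3e}")    # compact scientific notation
print(p_value < 0.05)      # far below the usual 0.05 threshold

# Hypothetical toy data standing in for the real dataset.
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "Gender": rng.integers(0, 2, size=200),
    "Delay": rng.integers(0, 5, size=200),
})

# margins=True appends an "All" row and column of totals.
with_margins = pd.crosstab(toy["Gender"], toy["Delay"], margins=True)
without_margins = pd.crosstab(toy["Gender"], toy["Delay"])

chi2_m, p_m, dof_m, _ = stats.chi2_contingency(with_margins)
chi2, p, dof, _ = stats.chi2_contingency(without_margins)

# Degrees of freedom are (rows - 1) * (columns - 1), so the totals inflate them.
print(f"with margins: dof={dof_m}, without margins: dof={dof}")
```

Passing the table without margins keeps the totals out of the test, which is usually what a chi-square test of independence expects.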

As a second approach, I tried the Shapiro-Wilk test to check for normality.

The code (stats.shapiro(male)) does not even run and raises this error:

ValueError: Data must be at least length 3.
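One possible cause, stated as an assumption because the column dtypes are not shown: if `Gender` is stored as integers, comparing it with the string `"0"` matches no rows, so `male` is empty and `shapiro` receives fewer than three values. A minimal sketch on made-up data (the `toy` frame is hypothetical, and boolean indexing is used here in place of `query`):

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical toy data; Gender is deliberately an integer column (assumption).
rng = np.random.default_rng(1)
toy = pd.DataFrame({
    "Gender": rng.integers(0, 2, size=100),
    "Delay": rng.normal(loc=10.0, scale=3.0, size=100),
})

print(toy["Gender"].dtype)  # checking the dtype first shows what to compare against

# Comparing an integer column with the string "0" matches nothing.
as_string = toy.loc[toy["Gender"] == "0", "Delay"]
# Comparing with the integer 0 selects the intended group.
as_int = toy.loc[toy["Gender"] == 0, "Delay"]
print(len(as_string), len(as_int))

# Shapiro-Wilk needs at least 3 observations, so guard before calling it.
if len(as_int) >= 3:
    w_stat, shapiro_p = stats.shapiro(as_int)
    print(f"W={w_stat:.3f}, p={shapiro_p:.3f}")
```

Checking `len(male)` and `df['Gender'].dtype` on the real data would confirm or rule out this explanation.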

Lastly, I ran the t-test anyway to see whether it would clarify anything, but it didn't.

rp.ttest(group1=df['Delay'][df['Gender'] == '0'], group1_name="Male",
         group2=df['Delay'][df['Gender'] == '1'], group2_name="Female")

Output: Mean, SD, SE, and Conf. Interval all came back as NaN (although I know the data has no missing values).
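All-NaN summaries would be consistent with both groups being empty, for example when `Gender` holds integers but is compared with `'0'` and `'1'`; that is an assumption, not something verified against the dataset. The sketch below uses SciPy rather than `rp.ttest`, on made-up data (the `toy` frame is hypothetical), to compare the two groups with Welch's t-test and, as a non-parametric alternative, the Mann-Whitney U test:

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical toy data: integer-coded Gender and a skewed Delay column.
rng = np.random.default_rng(2)
toy = pd.DataFrame({
    "Gender": rng.integers(0, 2, size=120),
    "Delay": rng.exponential(scale=10.0, size=120),
})

# Select groups with integer comparisons (0 = male, 1 = female, as in the question).
male = toy.loc[toy["Gender"] == 0, "Delay"]
female = toy.loc[toy["Gender"] == 1, "Delay"]
print(male.size, female.size)  # confirm neither group is empty

# Welch's t-test does not assume equal variances between the groups.
t_stat, t_p = stats.ttest_ind(male, female, equal_var=False)

# Mann-Whitney U makes no normality assumption about Delay.
u_stat, u_p = stats.mannwhitneyu(male, female, alternative="two-sided")

print(f"Welch t-test:   t={t_stat:.3f}, p={t_p:.3f}")
print(f"Mann-Whitney U: U={u_stat:.1f}, p={u_p:.3f}")
```

Printing the group sizes before running any test makes an empty selection visible immediately.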

How can I use a statistical test with this dataset? Is there anything else I should be aware of?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow