Category "statistics"

Why a specific model is not appropriate, given data with 6 variables (they are chr variables)

I want to show why a specific model is not appropriate, given data with 6 variables (they are chr variables). The model is y = abc*(x1+x2), with a and b from the data

How to test symmetry of distribution in Python?

Given some data, I want to test the symmetry of its distribution. In R there is the function symmetry.test(..) https://www.rdocumentation.org/packages/lawstat/versions/3.4/topics

Issue with corr.test() results

I am running corr.test() to look at potential correlations between genes and bacteria in a dataframe using this code: spearman=cor.test(FullSet$counts.Bac, Full

P value and critical value in hypothesis testing

I need a little clarification on the p-value and critical-value approaches in hypothesis testing regarding the example below. Null Hypothesis: population mean = 80 Altern

How do I determine the likelihood of my data coming from a model distribution using Julia?

I am trying to do a statistical analysis in Julia on experimental data. I tried to create a model and use Turing to obtain distributions for the mean and standa

Can I include covariates outside of the minimally sufficient set in a causal framework that aren't in the causal pathway?

I am applying a causal method to a cohort study analysis on pollutant exposure and disease X. Based on our understanding of the disease, we believe that aging i

How to combine countpct and binomCI into the same summary statistic to be used in tableby function?

I'm using the tableby function from the arsenal package to create summary tables. For most of the statistics I need to generate, this package gives me exactly t

Difference in R and SPSS LMM output

I am working on a linear mixed model and am attempting to run one on the same data in R and SPSS. I'm using a treatment with two levels, looking at 10 different

How can I find the mode (a number) of a KDE histogram in Python?

I want to determine the X value that has the highest peak in the histogram. The code to print the histogram: fig=sns.displot(data=df, x='degrees', hue="TYPE", k

Generate underlying distribution from bins in Python

I found a PDF document describing the income distribution in the US in 1978. Per income range I have the percentage of the population that falls in that income

Perl: Finding mean and variance of large numbers without overflow

I am using a subroutine (stats) to calculate statistics for a list of numbers. These numbers may be big enough to lose precision if stored as normal perl number

How to perform a Levene's test using scipy

I've been trying to use scipy.stats.levene with no success. I have a numpy matrix with shape (2128, 45100). Each row is a sample and belongs to one of 3 cluste

How to run cor.test() on two different dataframes

I would like to run cor.test() on two separate dataframes but I am unsure how to proceed. I have two example dataframes with identical columns (patients) but di

Returning cov and std from sklearn Gaussian process?

I can return the covariance or the standard deviation from a GP using sklearn, like: y, cov = gp.predict(Xpredict,return_cov=True) y, std = gp.predict(Xpredict,

Strange statistics in Google Play Developer Console

Today I noticed strange statistics in my Google Play Developer Console for one of my applications. It is about Final installs on active devices: 17 July - th

How can I apply fisher.test in R to a large matrix, and extract p-values to a new matrix?

I have a large matrix (12 rows, 53 columns) with counts of how many times genes in my clusters "A", "B", "C", etc. overlap with clusters created by someone else

Error with SALib library for sensitivity analysis in Python

I am trying to perform sensitivity analysis using Sobol's method. I always get an error which I cannot solve. The code and the result are below. The input vari

Python: How to interpret the result of logistic regression by sm.Logit

When I run a logistic regression by sm.Logit (in the statsmodels library), part of the result is like this: Pseudo R-squ.: 0.4335 Log-Likeliho

Replacing missing values in R

I need help replacing missing values in the following dummy file. The following rule needs to be followed when replacing a missing value. If the value is the

BlueSky Statistics Hanging

When I try to start BlueSky Statistics, sometimes the application hangs and the "Starting BlueSky Statistics" box remains on the screen. I see the app open in