Category "r"

Ordered bar plot with multiple groupings

Example data are here. I am struggling to create an ordered and grouped bar plot. Any assistance appreciated. I have found two similar questions that essentially

Keras installation failed with RStudio: RcppTOML had non-zero exit status

I am struggling to install keras in RStudio version 2021.09.2 Build 382 (R version 3.6.0 (2019-04-26)) on Linux CentOS 7. I am getting this error message: ERR

ggplot2, arrange multiple plots, all the same size, no gaps in between

I would like to arrange multiple plots into one figure, without any gaps between the plot areas, and with all plots being exactly the same size (see image below for a

Formatting tables in xlsx files

I am trying to create a function for formatting every table in an xlsx file. I want to save N tables in an xlsx file and format all of the tables in that file.

Scatterplot comparing two variables with ggplot and tidy data

With untidy data, running a scatterplot comparing two variables is trivial in either base R or ggplot2. For example, here is a sample scatterplot from R for Dat

Subset data based on variable prefix

I have a large dataset in which the answers to one question are distributed among various columns. However, if the columns belong together, they share the same

Jitter according to density

I want to create a combination of a violin plot and a dot plot with ggplot. The idea is to shift the dots to the left and right if necessary, to avoid overlap. I know t

Reading .png files into R and creating a multiple plot

I am interested in using the following commands to read PNG files from my computer and make a multiple plot from them. plot(0:2, 0:2, type = "n", xaxt = "n", yaxt

Finding median eigenvalue with sparse matrix in r

I am working with SVD on a matrix $$Y_{m,n} = T_{m,m} \Sigma D^T_{n,n} $$ where $T$ and $D$ describe the row and the column entities of Y, respectively. The tru

Boxplot.stats R not identifying outliers

I have used boxplot.stats$out to get outliers of a list in R. However I noticed that many times it fails to identify outliers. For example: list = c(3,4,7,500)

Removing NAs from two columns in a data frame and shifting values up

I have this data frame: atac.v1.pbmc.5k.possorted.bam.bam possorted.bam.bam chr1.9941.10736 NA

Fastest way to find nearest value in vector

I have two integer/POSIXct vectors: a <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15) #has > 2 mil elements b <- c(4,6,10,16) # 200000 elements Now my resul

Correlation problems with two variables WITH NA

I have two variables and I want to know if they are correlated. They are distributed like this: X = 14,15,16,18,12,13,14,15 Y = NA, 13,12, NA, NA, 16,16, NA

How to exclude NA values in lm function (regression)?

I am doing a regression analysis with 70 countries. My dependent variable is 'Inequality' and my independent variable is 'Sanction'. My original columns look as

Multiple regression: R splits variable into multiple

Hey there, I want to explore the effect of Age and Gender on the points of a test via multiple linear regression. Yet when I type model <- lm(punkte~ Age + Gender, data = df) R gives m

R data.table struggling with conditional subsetting when column name is predefined elsewhere

Let's say I have a data table library(data.table) DT <- data.table(x=c(1,1,0,0),y=c(0,1,2,3)) column_name <- "x" x y 1: 1 0 2: 1 1 3: 0 2 4: 0 3 And

How to use na.omit with ggplot

The image shows the database, it starts with day 0 and ends with day 14. In between these, there are empty values for what I am plotting. I am unable to correc

Warning message: 'newdata' had 20 rows but variables found have 1000 rows

#This is my model linearMod <- lm( Housing_Training$SalePrice ~ Housing_Training$MSSubClass + Housing_Training$LotFrontage + Housing_Training$LotArea + Hous

Creating alternate series in r

I have a list of -0.5, -0.6, 0.7, 1, 1.5, 3, -5 and I would like to sort it as 3, -5, 1.5, -0.6, 1, -0.5, 0.7. In other words, I would like to separate the list

R Mann-Whitney-U test output like in SPSS

I want to run Mann-Whitney-U test. But R's wilcox.test(x~y, conf.int=TRUE) does not give such statistics as N, Mean Rank, Sum of Ranks, Z-value for both factors