Category "dplyr"

Insert rows for missing time measurements of a negative event

I have data from an experiment where Subjects rated an Event called f: df <- structure(list(Subject = c("A", "A", "A", "B", "B", "B"), T

How to avoid matrix/dataframe being piped (%>%) into list as an element in R

I want to create a list of matrices of correlations and covariances from a dataframe. I tried piping the dataframe into the list, using the magrittr pipe operat

How do I filter() over a dataset with map_dfr() more efficiently?

I have a list of word pairs: library(tidyverse) word_pairs <- structure(list(V1 = c("cup", "cup", "cup"), V2 = c("kilo", "slice","bacon")), row.names = c(NA

Data column not recognized in the ggplot geom_hline

I was wondering why variable mean_y is not recognized by my geom_hline(yintercept = unique(mean_y)) call? library(tidyverse) set.seed(20) n_groups <- 2 n_in

Calculating Highest In, First Out on trades

I am trying to use the Highest In, First Out accounting method on trades. Highest In, First Out means that when you sell, you sell your most expensive shares fi

Adding a column of totals using dplyr in a dataframe

How would you add a column to this dataset showing the number of individuals of each species? install.packages("ggplot") library(ggplot) library(ggplot2) star

Minimum value in dataframe greater than 0 in R

I have a dataset with ~2500 columns in R, and I am trying to find the minimum value greater than zero from the entire data frame. Once I have found this number,

Is there a way in R to add a row underneath that calculates difference of above rows (tidyr/dplyr)?

I have a really simple question but am not able to figure it out at all. animal age cat 12 dog 8 Normally I'd apply data %>% mutate(diff = age[1] - age[2]), b

Remove duplicate among consecutive values within a dataframe in R

I have a data frame such as COL1 COL2 COL3 G1 1 6 G1 2 6 G1 3 7 G1 4 9 G1 5 9 G1 6 9 G1 7 6 G1 8 6 G1 9 7 G1 10 7 G1 11 7 G1 12 8 G1 13 7 and I would like to rem

Fill up missing values based on other entries in R

I have a dataset input with a couple of missing values, and I have to create a dataset output with the following logic: If there is a missing value in any of the columns

Aggregate copublications associated with a primary publication

Each primary_citation may have multiple copublications. I would like to aggregate citation_id's associated with each primary citation. The following code works

Extract single value from function that returns multiple values for use with dplyr() pipe

I have the following data: date_range <- c('2020-01-31', '2020-02-28', '2020-03-31', '2020-04-30', '2020-05-31',

Can you use split_cols_by and also get a total column?

I'm making a table like this: basic_table() %>% split_cols_by("ARM") %>% analyze(vars = c("AGE", "BMRKR1"), afun = function(x) { in_rows( "M

Problem when creating a weights column in the table

Running regression with panel data on different geographical levels in the US and Euro area with weights that essentially look like this: lm(log(POP25) ~ log(EM

Overwrite variables if condition is met, else keep existing values R

I have a data frame df<-data.frame(Name=c('H001', 'H002', 'H003', 'H004', 'H005', 'H006', 'H007', 'H008', 'H009', 'H010'),

How to rank a variable in a column based on a conditional, when there are NAs in the column

I have a longitudinal data set with two people in which the rows of data are numbered as 'episodes', and some episodes have a test 'result'. The goal of the bel

How to get the frequency (count) of Variable C when Variables A and B are mentioned together?

I have the following dplyr code: df3 <- Table3%>% group_by(Q6,Q9,Q11) %>% summarise(count = n()) %>% mutate(per = paste0(round(100 *count/sum(

Dplyr Lags on Summarised Grouped Data

Using dplyr, I'm looking to summarise a new column of data as a lagged version of an existing column of grouped data. Reprex: dateidx <- as.Date(c("2019-

Is there a way to vectorize seq() and grep() to use in conjunction with dplyr?

Apologies if this is obvious, I don't have much experience with R. I have a function contains_leap_year(date1, date2) that I want to pass in as a condition to d

Create several new variables using a vector of names and a vector for computation within dplyr::mutate

I'd like to create several new columns. They should take their names from one vector and they should be computed by taking one column in the data and dividing i
