Category "dplyr"

Dplyr Lags on Summarised Grouped Data

Using dplyr, I'm looking to summarise a new column of data as a lagged version of an existing column of grouped data. Reprex: dateidx <- as.Date(c("2019-

Is there a way to vectorize seq() and grep() to use in conjunction with dplyr?

Apologies if this is obvious, I don't have much experience with R. I have a function contains_leap_year(date1, date2) that I want to pass in as a condition to d

Create several new variables using a vector of names and a vector for computation within dplyr::mutate

I'd like to create several new columns. They should take their names from one vector and they should be computed by taking one column in the data and dividing i

Merging three dfs of different row lengths

I need to merge three separate DFs ("factors_sed", "resp", and "npoc_sed") based on the shared column "Samples". Each DF contains a different number of rows (s

Change values in multiple columns if condition is met (R) [duplicate]

I have a dataframe: n <- 50 df <- data.frame(id = seq (1:n), age = sample(c(20:90), n, rep = TRUE), s

Error when updating a dataframe with new column inside a for loop using Dplyr

I have the following R dataframe df: library(tidyquant) start_date <- as.Date('2022-01-01') end_date <- as.Date('2022-03-31') assets_list <- c('DGS30

Calculate changes in totals of subgroups in R

I have the following dataframe: # A tibble: 8 x 5 Year Group Unit Profit Sales <dbl> <chr> <chr> <dbl> <dbl> 1 2021 One

Why is my dplyr code to create multiple variables using mutate and zoo incredibly slow?

I am using dplyr to create multiple variables in my data frame using mutate. At the same time, I am using zoo to calculate a rolling average. As an example, I h

Dplyr summarize "sum" function works correctly only for subset not the larger dataset in R

I have a dataset where I sampled abundance of 4 species across 12 months, 6 sites (5 replicates within a site). I am trying to calculate various summary stats (

Copy column from one data.frame to another based on index

The problem is similar to the one posted in "Combine dataframe based on index R". I am trying to copy one column from df2 (huge df) to df1 (small df) but based on ind

Lag and lead a variable in a dataframe by 1 month and 6 business days for panel data

I have a large panel data set and I would like to lag and lead a variable by 1 month and 6 business days. I know, for instance, from dplyr there is the lag or

How to filter very small values in R?

I have a large dataset in which one column is p-values that range from 0.9 to being extremely small like 5e-79. In R I can sort the data in descending order and

Use dplyr::select's where with base R grepl and anonymous function

There is a very similar question here: "How to select columns based on grep in dplyr::tibble". However, I think that select_if was superseded by select(where

Why doesn't R dplyr arrange sort properly using a vector element within a for loop

I'm having trouble getting R's dplyr::arrange() to sort properly when used in a for loop. I found many posts discussing this issue (like ex.1 with the .by_grou

Joining two datasets by (non-uniform) names

I need to join two datasets and the only identifier in both are the company names. For example: db1 <- tibble( Company = c('Bombardier Inc.','Honeywell Dev

Paste together results within case_when (if-else) statements

I want to paste together results within the same case_when statement (i.e., if multiple statements are true for a given row). I know that I could do something l

dplyr get linear regression coefficients

I'm wondering if there is a better way to get linear regression coefficients as columns in dplyr. Here is some sample data. mydata <- data.frame( S

rewriting `summarise_all` without deprecated `funs`, using Simple list and Auto-named list

I'm trying to count the number of NA values in each of 2 columns. The code below works. temp2 %>% select(c18basic, c18ipug) %>% summarise_all(funs(sum

How to return the range of values shared between two data frames in R?

I have several data frames that have the same column names: an ID, followed by two columns giving the start (from) and end (to) of a range, and a group label from each of the

Rename several columns using start with in r

I want to rename multiple columns that start with the same string. However, none of the code I tried changed the columns. For example this: df %>% renam