Category "dplyr"

Modify a single cell value in dplyr

Let's say I have the following dataset: dat <- read.table(text="id_1 id_2 123 NA 456 NA NA 3

Conditional mutate - creating a new variable with coalesce

I'm scraping data from a website and, depending on the structure of the page, I have an inner join in my final table that either joins cleanly on WON and LOST vari

Creating predicted vs observed confidence interval graph

Hello and thank you for your time and consideration. I'd like to recreate this graph with ggplot. The top blue dots are the predicted values from my fitted model

How to add sequence of numbers to each group of a data.frame?

I have a dataframe of US zipcodes and I want to add a sequence of numbers to each unique zipcode while repeating the rest of the rows. Right now, my data looks

using the uniroot function with dplyr pipes

I'm trying to utilize the uniroot function inside a piping scheme. I have root data by depth, and I fit a model for each crop-year set and put the fitted parame

Specify order after gather and spread

I want to keep the order of the output variables the same as the order they were created in the mutate statement. How do I accomplish this? It seems to be reor

Coalesce columns and create another column to specify source

I'm using dplyr::coalesce() to combine several columns into one. Originally, across columns, each row has only one column with actual value while the other colu

R Dataframe By Group Calculation

I have a dataframe like the one below (the real data has many more players and clubs): Year Player Club 2005 Phelan Chicago Fire 2007 Phelan Boston Pant 2

Automating conditional logic for database data checks in R

I am trying to do a large data check for a database. Some fields in the database are hidden, so when I am doing the datacheck, I need to ignore all hidden field

How to left_join() two datasets but only select specific columns from one of the datasets?

Here are two datasets: (this is fake data) library(tidyverse) myfruit <- tibble(fruit_name = c("apple", "pear", "banana", "cherry"), number

How to look at differences between 2 columns in R

I just need to write some code that will look at the difference between the "est_age" and "known_age" columns in my data set. Then I need to know what percenta

max.col with the value not the index

If I have a matrix: mod_xgb_softprob$pred[1:3,1:3] [,1] [,2] [,3] [1,] 6.781361e-04 6.781361e-04 6.781422e-04 [2,] 2.022457e-07 2.

In dplyr using str_detect and case_when in R

This is my df: mydf <- structure(list(Action = c("Passes accurate", "Passes accurate", "Passes accurate", "Passes accurate", "Lost balls", "Lost balls (in o

Include 'blank' filters in dplyr filter chain in Shiny app

I have a shiny application with numerous user inputs including numericInput and textInput and pickerInput. These inputs are used to filter a dataframe. In my fi

separate_columns for tidyr

Let's say I had a survey question that read: What did you eat? [ ] apple [ ] pear [x] banana [x] grapes Now, I have the endorsed options as comma-separated st

Calculate Stock

Is it possible to calculate stock using R? The formula is stock + purchase - sold. In this case the first stock (row 1) is 0, e.g. the first result stockB1 = 12 - 3 = 9, the secon

grouped data frame to list

I've got a data frame that contains names that are grouped, like so: df <- data.frame(group = rep(letters[1:2], each=2), name = LETTERS[1:4

Difference between pull and select in dplyr?

It seems like dplyr::pull() and dplyr::select() do the same thing. Is there a difference besides that dplyr::pull() only selects 1 variable?

R how to group part of overlapped values among rows?

I have a data frame in which some rows need to be further grouped by the values they share with other rows: col1, col2 a1, 2;3 a2, 2 a3, 3;4 a4, 4 a