Test for set inclusion and processing data simultaneously in tidyverse
I almost have what I need. I need some help with the last detail! The data set is produced by the following:
stu_vec <- c("A","B","C","D","E","F","G","H","I","J")
college_vec <- c("ATC","CCTC","DTC","FDTC","GTC","NETC", "USC", "Clemson", "Winthrop", "Allen")
sctcs <- c("ATC","CCTC","DTC","FDTC","GTC","NETC")
set.seed(29) # seed before sampling so the generated data are reproducible
Student <- sample(stu_vec, size=100, replace=T, prob=c(.08,0.09,0.06,.07,.12,.10,.07,.05,.11,.05))
College <- sample(college_vec, size=100, replace=T,prob=c(.08,.07,.13,.12,.11,.06,.05,.08,.02,.08))
test.dat1 <- as.data.frame(cbind(Student, College))
I am using the following code to create what I need
library(dplyr)
set.seed(29)
test.dat2 <- test.dat1 %>%
  group_by(Student, .drop=F) %>% # group by student
  mutate(semester = sequence(n())) %>% # set semester sequence
  summarise(
    # find first college in sctcs
    home_school = College[min(which(College %in% sctcs))],
    # add column of sequence values
    seq_home = min(which(College %in% sctcs)),
    # new_school should be the first non-sctcs school after the sctcs school
    # is found, or the last school for that student
    new_school = if_else(n_distinct(College) > 1,
                         first(College[!(College %in% sctcs) & semester > seq_home]),
                         last(College)))
It produces a table (not reproduced here) in which new_school is NA for some students.
I want the NAs to be filled in with the last college for that student, but I don't know how to get rid of them. If you know an easier way to produce the same thing, please share it.
Solution 1:[1]
It's not clear what you're trying to do. But when the condition !(College %in% sctcs) & semester > seq_home is FALSE for every row of a student, College[!(College %in% sctcs) & semester > seq_home] is a zero-length character vector, so first(College[!(College %in% sctcs) & semester > seq_home]) returns NA.
When there are no TRUE values in [!(College %in% sctcs) & semester > seq_home], it's because there are no non-sctcs colleges in any of the semesters after semester[seq_home]. If a student transfers from home_school to one or more sctcs schools, but never to any non-sctcs schools, you'll get an NA value.
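The behaviour described above can be reproduced in isolation. This is a minimal sketch with a hypothetical student who only ever attends sctcs schools; the three-semester history is made up for illustration and is not taken from the question's data.

```r
library(dplyr)

sctcs <- c("ATC", "CCTC", "DTC", "FDTC", "GTC", "NETC")

# Hypothetical student: every college attended is an sctcs school
College  <- c("ATC", "CCTC", "DTC")
semester <- seq_along(College)
seq_home <- min(which(College %in% sctcs))  # position of the first sctcs school

# The subsetting condition is FALSE for every semester ...
keep <- !(College %in% sctcs) & semester > seq_home

# ... so the subset has length zero ...
candidates <- College[keep]
length(candidates)

# ... and first() on a zero-length vector falls back to a missing value
first(candidates)
```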
You're effectively asking the wrong question. I'm not sure what question you're trying to ask, but what you're currently asking is:
What's the first non-sctcs school this student attended after they attended their first sctcs school?
Some students, however, never attend a non-sctcs school after attending their first sctcs school. For this reason, you get an NA response, which is the correct answer to the question.
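If the intended question is instead "the first non-sctcs school after the first sctcs school, or else the student's last school", one possible sketch is to pass a fallback through the default argument of first(). The toy data frame below is hypothetical, and this is one way to get that behaviour under that assumption, not necessarily what the asker needs.

```r
library(dplyr)

sctcs <- c("ATC", "CCTC", "DTC", "FDTC", "GTC", "NETC")

# Hypothetical toy data: A leaves for a non-sctcs school, B never does
toy <- data.frame(
  Student = c("A", "A", "A", "B", "B"),
  College = c("ATC", "USC", "GTC", "DTC", "CCTC")
)

result <- toy %>%
  group_by(Student) %>%
  mutate(semester = row_number()) %>%
  summarise(
    seq_home    = min(which(College %in% sctcs)),  # position of first sctcs school
    home_school = College[seq_home],
    # first non-sctcs school after seq_home; if there is none,
    # fall back to the student's last school instead of NA
    new_school  = first(
      College[!(College %in% sctcs) & semester > seq_home],
      default = last(College)
    )
  )

result
```

Here student A gets new_school "USC" and student B, who never attends a non-sctcs school, gets their last school "CCTC". Because the fallback covers every case where the subset is empty, the if_else() on n_distinct(College) should no longer be needed in this version.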
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow

