Creating session number based on 'to' and 'from' character values

This seems silly, but it is giving me trouble. I have a data frame with multiple columns; one of them holds experiment events (character strings), for example:

Start
resp
lick
End
test
test
Start
lick
resp
lick
End

I want to add a column with a session number. Each session starts at a "Start" row and ends at the next "End" row, sessions are numbered consecutively, and rows outside any session get 0, so the output would be

    Start 1
    resp  1
    lick  1
    End   1   
    test  0
    test  0
    Start 2
    lick  2
    resp  2
    lick  2
    End   2

Any easy ways to do this? I am sure tidyverse can do it, I am just not seeing it! Thank you!



Solution 1:[1]

This is a really interesting question, and harder than I thought. One way to do it is to first get the positions of the rows that lie between "Start" and "End" (using which and sequence), and then assign an ID to those rows, one per run of consecutive positions. Rows outside any session are left as NA and then set to 0. This can be done in the following way:

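    # Row positions of the session boundaries
    start <- which(dat$col1 == "Start")
    end   <- which(dat$col1 == "End")
    # All row indices from each Start to its End (the `from` argument needs R >= 4.0.0)
    (s <- sequence(end - start + 1, start))
    # [1]  1  2  3  4  7  8  9 10 11

    # A gap in the indices starts a new session; rows outside any session become 0
    dat[s, "col2"] <- cumsum(c(1, diff(s) != 1))
    dat[is.na(dat$col2), "col2"] <- 0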

output

    dat
        col1 col2
    1  Start    1
    2   resp    1
    3   lick    1
    4    End    1
    5   test    0
    6   test    0
    7  Start    2
    8   lick    2
    9   resp    2
    10  lick    2
    11   End    2
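
To see why the `cumsum(c(1, diff(s) != 1))` step numbers the sessions, it can help to look at the intermediate values. The sketch below uses a plain vector `events` as a stand-in for `dat$col1` (that name and `new_session` are only for illustration, not part of the answer above):

```r
# Stand-in for dat$col1: the same 11 events as in the question
events <- c("Start", "resp", "lick", "End", "test", "test",
            "Start", "lick", "resp", "lick", "End")

start <- which(events == "Start")   # 1 7
end   <- which(events == "End")     # 4 11

# Row indices inside each Start..End block (`from` needs R >= 4.0.0)
s <- sequence(end - start + 1, from = start)   # 1 2 3 4 7 8 9 10 11

# diff(s) is 1 within a block and larger where one block ends and the
# next begins, so every such jump marks the first row of a new session
new_session <- c(TRUE, diff(s) != 1)

# Counting the jumps so far gives the session number of each indexed row
id <- cumsum(new_session)   # 1 1 1 1 2 2 2 2 2
```

The jump from 4 to 7 is the only place where `diff(s)` is not 1, which is exactly where the second session begins.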

data

    dat <- structure(list(col1 = c("Start", "resp", "lick", "End", "test", "test", 
    "Start", "lick", "resp", "lick", "End")), class = "data.frame", row.names = c(NA, 
    -11L))
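
As a possible alternative (not part of the answer above), the same numbering can be sketched with running counts of "Start" and "End" rows, which avoids building the index vector. This assumes every "Start" is followed by a matching "End" and that sessions are never nested; the helper names `is_start`, `is_end`, `opened`, `closed_before` and `inside` are made up for illustration:

```r
# Same example data as above, rebuilt here so the snippet runs on its own
dat <- data.frame(col1 = c("Start", "resp", "lick", "End", "test", "test",
                           "Start", "lick", "resp", "lick", "End"))

is_start <- dat$col1 == "Start"
is_end   <- dat$col1 == "End"

# Number of sessions opened up to and including each row
opened <- cumsum(is_start)

# Number of sessions closed strictly before each row, so that the
# "End" row itself still counts as part of its session
closed_before <- cumsum(is_end) - is_end

# A row is inside a session when more sessions were opened than closed
inside <- opened > closed_before

# Session number inside a session, 0 everywhere else
dat$col2 <- ifelse(inside, opened, 0L)
```

Because every step is a vectorised expression on one column, the same logic should also work inside `dplyr::mutate()` for anyone who prefers a tidyverse pipeline.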

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
