'how to find where the interval of continuous numbers starts and ends?
I have a vector
vec <- c(2, 3, 5, 6, 7, 8, 16, 19, 22, 23, 24)
The continuous numbers are:
c(2, 3)
c(5, 6, 7, 8)
c(22, 23, 24)
So the first vector starts at 2 and ends at 3;
for the second vector starts at 5 and ends at 8;
for the third vector starts at 22 and ends at 24;
There is a function to identify where the continuous numbers starts and ends?
Solution 1:[1]
By using diff
to check the differences between each consecutive value, you can find where the difference is not +1
.
diff(vec)
## [1] 1 2 1 1 1 8 3 3 1 1
c(1, diff(vec)) != 1
## [1] FALSE FALSE TRUE FALSE FALSE FALSE TRUE TRUE TRUE FALSE FALSE
Then use cumsum
to make a group identifier:
cumsum(c(1, diff(vec))!=1)
## [1] 0 0 1 1 1 1 2 3 4 4 4
And use this to split
your data up:
split(vec, cumsum(c(1, diff(vec))!=1))
##$`0`
##[1] 2 3
##
##$`1`
##[1] 5 6 7 8
##
##$`2`
##[1] 16
##
##$`3`
##[1] 19
##
##$`4`
##[1] 22 23 24
Which can be Filter
ed to consecutive values:
Filter(\(x) length(x) > 1, split(vec, cumsum(c(1, diff(vec))!=1)))
##$`0`
##[1] 2 3
##
##$`1`
##[1] 5 6 7 8
##
##$`4`
##[1] 22 23 24
Solution 2:[2]
Another one
vec=c( 2 , 3 , 5 , 6 , 7 , 8 , 16 , 19 , 22 , 23 , 24 )
x <- replace(NA, vec, vec)
# [1] NA 2 3 NA 5 6 7 8 NA NA NA NA NA NA NA 16 NA NA 19 NA NA 22 23 24
l <- split(x, with(rle(is.na(x)), rep(seq.int(length(lengths)), lengths)))
# l <- split(x, data.table::rleid(is.na(x))) ## same as above
l <- Filter(Negate(anyNA), l)
l
# $`2`
# [1] 2 3
#
# $`4`
# [1] 5 6 7 8
#
# $`6`
# [1] 16
#
# $`8`
# [1] 19
#
# $`10`
# [1] 22 23 24
If you have a length requirement:
l[lengths(l) > 1]
# $`2`
# [1] 2 3
#
# $`4`
# [1] 5 6 7 8
#
# $`10`
# [1] 22 23 24
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | thelatemail |
Solution 2 | rawr |