Assign values from one vector to groups based on another vector in R

I have 2 vectors:

v1 = c(4, 8, 13)
v2 = 1:20

I want to assign each element of v2 to a group according to where it falls relative to the values in v1. In this example, values 1:4 of v2 go to group1, values 5:8 to group2, values 9:13 to group3, and values 14:20 to group4.

I would like the final data frame to have one row per element of v2, holding the value and its group label.

Any suggestions?

r


Solution 1:[1]

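We can use %in% with cumsum to build the group index, then paste0 to prepend 'group'. v2 %in% (v1 + 1) is TRUE at the first value of each new group, so its cumulative sum increases by one at every group boundary.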

data.frame(ls2 = v2, group = paste0("group", cumsum(v2 %in% (v1 + 1)) + 1))

Output

   ls2  group
1    1 group1
2    2 group1
3    3 group1
4    4 group1
5    5 group2
6    6 group2
7    7 group2
8    8 group2
9    9 group3
10  10 group3
11  11 group3
12  12 group3
13  13 group3
14  14 group4
15  15 group4
16  16 group4
17  17 group4
18  18 group4
19  19 group4
20  20 group4
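To see why the cumsum trick works, it can help to look at the intermediate values. The following is a minimal sketch; the names starts and idx are illustrative and not part of the original answer. It relies on v2 being sorted and containing every v1 + 1 value, which holds in the example but is an assumption in general.

```r
v1 <- c(4, 8, 13)
v2 <- 1:20

# TRUE at the first value after each breakpoint (5, 9 and 14)
starts <- v2 %in% (v1 + 1)
which(starts)

# Running count of group starts seen so far, shifted so the first group is 1
idx <- cumsum(starts) + 1
table(idx)
```

If v2 were unsorted or skipped one of the v1 + 1 values, the counter would not advance at the right place, so one of the interval-based solutions would be safer.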

Solution 2:[2]

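You could use findInterval. Setting left.open = TRUE makes the intervals closed on the right, so a value equal to a breakpoint (such as 4) stays in the lower group:

data.frame(
  v2,
  group = paste0("group", findInterval(v2, v1, left.open = TRUE) + 1)
)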

Output

   v2  group
1   1 group1
2   2 group1
3   3 group1
4   4 group1
5   5 group2
6   6 group2
7   7 group2
8   8 group2
9   9 group3
10 10 group3
11 11 group3
12 12 group3
13 13 group3
14 14 group4
15 15 group4
16 16 group4
17 17 group4
18 18 group4
19 19 group4
20 20 group4
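The effect of left.open is easiest to see at the breakpoints themselves. A small sketch comparing the default behaviour with left.open = TRUE follows; the names default_idx and right_closed_idx are illustrative only.

```r
v1 <- c(4, 8, 13)
v2 <- 1:20

# Default: intervals are closed on the left, [4, 8), so 4 already counts as past the first break
default_idx <- findInterval(v2, v1) + 1

# left.open = TRUE: intervals are closed on the right, (4, 8], so 4 stays in the first group
right_closed_idx <- findInterval(v2, v1, left.open = TRUE) + 1

# Compare the two at the breakpoints; indexing by v1 works here only because v2 is 1:20
data.frame(v2 = v1, default = default_idx[v1], left_open = right_closed_idx[v1])
```

With the default, each breakpoint would land in the next group up, which is not what the question asks for.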

Solution 3:[3]

There are likely more universal/elegant solutions, but to recreate the desired dataset given the example, you could try:


library(dplyr)  # case_when() is a dplyr function

group <- case_when(v2 <= v1[1] ~ "group1",
                   v2 %in% (v1[1] + 1):v1[2] ~ "group2",
                   v2 %in% (v1[2] + 1):v1[3] ~ "group3",
                   v2 > v1[3] ~ "group4"
)

cbind(v2, group)
#      v2   group
# [1,] "1"  "group1"
# [2,] "2"  "group1"
# [3,] "3"  "group1"
# [4,] "4"  "group1"
# [5,] "5"  "group2"
# [6,] "6"  "group2"
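Note that cbind on a numeric and a character vector returns a character matrix, which is why v2 appears quoted in the output above. If a data frame with a numeric v2 column is wanted, data.frame can be used instead. In the sketch below, group is a base R stand-in for the case_when() result so that the example has no package dependencies; that substitution is an assumption made for illustration.

```r
v2 <- 1:20

# Stand-in for the case_when() result, built with base R (illustrative only)
group <- rep(paste0("group", 1:4), times = c(4, 4, 5, 7))

m  <- cbind(v2, group)        # character matrix: v2 is coerced to strings
df <- data.frame(v2, group)   # data frame: v2 keeps its integer type

typeof(m)
class(df$v2)
```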

Solution 4:[4]

Another possible solution:

v1 = c(4, 8, 13)
v2 = 1:20

data.frame(v2, groups = cut(v2, breaks = c(0,v1,20),
           labels = paste0("group", 1:(1+length(v1)))))

#>    v2 groups
#> 1   1 group1
#> 2   2 group1
#> 3   3 group1
#> 4   4 group1
#> 5   5 group2
#> 6   6 group2
#> 7   7 group2
#> 8   8 group2
#> 9   9 group3
#> 10 10 group3
#> 11 11 group3
#> 12 12 group3
#> 13 13 group3
#> 14 14 group4
#> 15 15 group4
#> 16 16 group4
#> 17 17 group4
#> 18 18 group4
#> 19 19 group4
#> 20 20 group4
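The breaks above hard-code 0 and 20 as the outer limits, which fits this example. As a hedged variation rather than the original answer's method, the outer breaks can be set to -Inf and Inf so that no value of v2 falls outside the bins:

```r
v1 <- c(4, 8, 13)
v2 <- 1:20

# -Inf and Inf as outer breaks, so every value of v2 lands in some bin
groups <- cut(v2,
              breaks = c(-Inf, v1, Inf),
              labels = paste0("group", seq_len(length(v1) + 1)))

table(groups)
```

Because cut uses right-closed intervals by default, each breakpoint in v1 belongs to the lower group, matching the output shown above.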

Solution 5:[5]

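A straightforward approach, assuming v2 is the consecutive sequence 1:length(v2) as in the example, is to repeat each group id by the size of its range: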

id = seq_len(length(v1) + 1)
n = diff(c(0, v1, length(v2)))

rep(id, n)
#[1] 1 1 1 1 2 2 2 2 3 3 3 3 3 4 4 4 4 4 4 4
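To get from these integer ids to the labelled data frame the question asks for, the result can be passed through paste0. This is a minimal sketch; the name result is illustrative.

```r
v1 <- c(4, 8, 13)
v2 <- 1:20

id <- seq_len(length(v1) + 1)
n  <- diff(c(0, v1, length(v2)))   # group sizes: 4 4 5 7

# Valid only because v2 is exactly 1:length(v2); rep() never looks at the values of v2
result <- data.frame(v2, group = paste0("group", rep(id, n)))
head(result)
```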

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 akrun
Solution 2 Ben
Solution 3
Solution 4
Solution 5 alexis_laz