Identifying last occurring duplicates in a vector in R

I would like to identify the positions of all unique values, plus the last occurrence of each value that appears more than once, in a vector. For example, I would like to identify the positions

c(2,3,4,6,7)

in the vector:

v <- c("m", "m", "k", "r", "l", "o", "l")

I see that

(duplicated(v) | duplicated(v, fromLast = TRUE))

flags every occurrence of a duplicated value, but I only want the last occurrence of each duplicated element, together with the values that occur once.

How can I achieve this without a loop?



Solution 1:[1]

You could do something like:

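library(dplyr)

v %>%
  as_tibble() %>%                    # one column, named `value`
  mutate(index = row_number()) %>%   # position in the original vector
  group_by(value) %>%
  mutate(id = row_number()) %>%      # running count within each value
  filter(id == max(id))              # keep the last occurrence of each value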

Which gives us:

# A tibble: 5 × 3
# Groups:   value [5]
  value index    id
  <chr> <int> <int>
1 m         2     2
2 k         3     1
3 r         4     1
4 o         6     1
5 l         7     2

Additionally, if you just want the index, you can do:

v %>% 
  as_tibble() %>% 
  mutate(index = row_number()) %>% 
  group_by(value) %>% 
  mutate(id = row_number()) %>%
  filter(id == max(id)) %>%
  pull(index)

...to get:

[1] 2 3 4 6 7
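
The grouping logic above can also be sketched in base R, which may help when dplyr is not available. This is only an illustration of the same idea, not the answerer's code: the names `id`, `id_max` and `last_pos` are chosen here, and `ave()` stands in for `group_by()` followed by `mutate()`.

```r
v <- c("m", "m", "k", "r", "l", "o", "l")

# Running count within each value (the analogue of row_number() per group)
id <- ave(seq_along(v), v, FUN = seq_along)

# Largest running count within each value
id_max <- ave(id, v, FUN = max)

# Positions where the running count equals the group maximum,
# i.e. the last occurrence of each value
last_pos <- which(id == id_max)
print(last_pos)
```

For the example vector this should print the same positions as the pipeline, 2 3 4 6 7.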

Solution 2:[2]

We can try

> sort(tapply(seq_along(v), v, max))
m k r o l 
2 3 4 6 7

or

> unique(ave(seq_along(v), v, FUN = max))
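[1] 2 3 4 7 6

Note that this version returns the positions in order of each value's first appearance rather than in ascending order; wrap it in sort() if the order matters.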

or

> rev(length(v) - which(!duplicated(rev(v))) + 1)
[1] 2 3 4 6 7
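
A shorter route, sketched here as one reading of what the question asks for rather than as part of either answer, is to negate `duplicated()` with `fromLast = TRUE`. Scanning from the end marks every element that has an identical element later in the vector, so the elements left unmarked are the unique values and the last occurrences. The names `has_later_copy` and `last_pos` are made up for this example.

```r
v <- c("m", "m", "k", "r", "l", "o", "l")

# TRUE for every element that has an identical element later in the vector
has_later_copy <- duplicated(v, fromLast = TRUE)

# Elements without a later copy: unique values and last occurrences
last_pos <- which(!has_later_copy)
print(last_pos)
```

For the example vector this should give 2 3 4 6 7, already in ascending order, so no sorting or reversing step is needed.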

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution    Source
Solution 1  Matt
Solution 2  ThomasIsCoding