Get number of entries between two words
Hi, I can't find a way to prepare my data.
At the moment I have two vectors like the ones below and would like to match the names to their corresponding titles in an extra column. Which names belong to which title is indicated by an entry in the name data called "Top": each "Top" starts the block of names for the next title.
name <- c("Top", "Name1", "Name2", "Top", "Name3", "Top", "Name4", "Name5")
title <- c("Title1", "Title2", "Title3")
What the result should look like:
Title1 Name1
Title1 Name2
Title2 Name3
Title3 Name4
Title3 Name5
One idea is to count the names between consecutive "Top" entries and replicate each title that many times, but how can I do that?
Solution 1:[1]
We could build a running group id from the "Top" rows:
- number the blocks with cumsum(name == "Top"), so every row gets the index of its block as id
- use id to look up the matching entry in title
- finally filter to remove the Top rows
library(dplyr)
library(stringr)
# Assumes name and title are the vectors from the question
df <- tibble(name = name)
df %>%
  mutate(id = cumsum(name == "Top"),    # block index, increases at every "Top"
         Title = title[id]) %>%         # title belonging to that block
  filter(str_detect(name, "Name")) %>%  # drop the "Top" marker rows
  select(Title, name)
Title name
<chr> <chr>
1 Title1 Name1
2 Title1 Name2
3 Title2 Name3
4 Title3 Name4
5 Title3 Name5
Solution 2:[2]
In base R:
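# Strip the digits so every name becomes the label "Name"
g <- gsub('[0-9]+', '', name)
# Run-length encode the labels; each run of "Name" is one block
s <- rle(g)
# Repeat each title by the length of its "Name" run
data.frame(Title = rep(title, s$lengths[s$values == "Name"]),
           Name = name[g == "Name"])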
Title Name
1 Title1 Name1
2 Title1 Name2
3 Title2 Name3
4 Title3 Name4
5 Title3 Name5
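To see why the run-length approach works, it can help to look at the intermediate values. The following is a minimal sketch using the vectors from the question; `counts` is a name introduced here for illustration and is not part of the original answer.

```r
name  <- c("Top", "Name1", "Name2", "Top", "Name3", "Top", "Name4", "Name5")
title <- c("Title1", "Title2", "Title3")

# Strip digits so every name collapses to the same label "Name"
g <- gsub("[0-9]+", "", name)
# g is: "Top" "Name" "Name" "Top" "Name" "Top" "Name" "Name"

# Run-length encode: one run per consecutive stretch of identical labels
s <- rle(g)
# s$values  is: "Top" "Name" "Top" "Name" "Top" "Name"
# s$lengths is:  1     2      1     1      1     2

# Lengths of the "Name" runs = number of names under each "Top"
counts <- s$lengths[s$values == "Name"]
# counts is: 2 1 2

# Replicate each title by its count
rep(title, counts)
```

This relies on every "Top" being followed by at least one name. Two consecutive "Top" entries would merge into a single run, so `counts` would no longer line up with `title`.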
Solution 3:[3]
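In base R, using cumsum to assign each name to its block:
# TRUE at every "Top" marker
i1 <- name == 'Top'
# Split the names by block index, label the blocks with title, then stack into two columns
setNames(stack(setNames(split(name[!i1], cumsum(i1)[!i1]), title))[2:1],
         c("Title", "Name"))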
Title Name
1 Title1 Name1
2 Title1 Name2
3 Title2 Name3
4 Title3 Name4
5 Title3 Name5
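The same block index can also be used to look up titles directly, without split and stack. This is a sketch of a variant, not the answer's own code; `grp` and `res` are illustrative names, and the vectors are those from the question.

```r
name  <- c("Top", "Name1", "Name2", "Top", "Name3", "Top", "Name4", "Name5")
title <- c("Title1", "Title2", "Title3")

# TRUE at every "Top" marker
i1 <- name == "Top"

# Running count of markers: each element gets the index of its block
grp <- cumsum(i1)
# grp is: 1 1 1 2 2 3 3 3

# Drop the marker rows and pick each remaining row's title by block index
res <- data.frame(Title = title[grp[!i1]], Name = name[!i1])
res
```

Because the title is selected by index, a "Top" with no names after it simply contributes no rows, whereas the split-based version expects one non-empty block per title.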
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Maël |
| Solution 3 | akrun |
