Get number of entries between two words

Hi, I can't find a way to prepare my data.

At the moment I have two vectors like the ones below, and I would like to match the names to their corresponding titles in an extra column. Which names belong to which title is indicated by an entry in the name data called "Top".

name <- c("Top", "Name1", "Name2", "Top", "Name3", "Top", "Name4", "Name5")
title <- c("Title1", "Title2", "Title3")

What the result should look like:

Title1   Name1
Title1   Name2
Title2   Name3
Title3   Name4
Title3   Name5

One idea is to get the number of names between the "Top" entries and replicate the titles according to those numbers, but how can I do that?

r


Solution 1:[1]

We could number the blocks with cumsum and use that number to look up the title:

  1. put name in a data frame and add id, a running count of the "Top" rows, so every row carries the number of the block it belongs to
  2. filter to remove the Top rows
  3. use id to index into title, then keep only Title and name

library(dplyr)

df <- tibble(name = name)

df %>% 
  mutate(id = cumsum(name == "Top")) %>%   # block number: 1 1 1 2 2 3 3 3
  filter(name != "Top") %>%                # drop the "Top" marker rows
  mutate(Title = title[id]) %>%            # look up each row's title by block number
  select(Title, name)

 Title  name 
  <chr>  <chr>
1 Title1 Name1
2 Title1 Name2
3 Title2 Name3
4 Title3 Name4
5 Title3 Name5

Solution 2:[2]

In base R:

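g <- gsub('[0-9]+', '', name)   # strip the digits so every name becomes "Name"
s <- rle(g)                     # run lengths of consecutive identical labels

# repeat each title by the length of the "Name" run that follows its "Top"
data.frame(Title = rep(title, s$lengths[s$values == "Name"]),
           Name = name[g == "Name"])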

   Title  Name
1 Title1 Name1
2 Title1 Name2
3 Title2 Name3
4 Title3 Name4
5 Title3 Name5
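
To make the role of rle easier to see, and as a rough sketch of the counting idea from the question, the block below prints the intermediate values and then derives the same counts from the positions of "Top" instead. top_pos, counts and result are names introduced only for this illustration. It assumes there is exactly one "Top" row per title; unlike the rle version, it does not depend on the names sharing a common prefix.

```r
name <- c("Top", "Name1", "Name2", "Top", "Name3", "Top", "Name4", "Name5")
title <- c("Title1", "Title2", "Title3")

# Intermediate values of the rle approach
g <- gsub("[0-9]+", "", name)
s <- rle(g)
s$values    # "Top" "Name" "Top" "Name" "Top" "Name"
s$lengths   # 1 2 1 1 1 2

# Alternative (illustrative): count the entries between consecutive "Top" positions
top_pos <- which(name == "Top")                     # 1 4 6
counts <- diff(c(top_pos, length(name) + 1)) - 1    # 2 1 2

result <- data.frame(Title = rep(title, counts),
                     Name = name[name != "Top"])
result
```

With the position-based counts, a "Top" that is immediately followed by another "Top" simply gets a count of 0, whereas rle would merge the two "Top" rows into one run and the counts would no longer line up with title.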

Solution 3:[3]

We can also do this in base R with cumsum:

i1 <- name == 'Top'
setNames(stack(setNames(split(name[!i1], cumsum(i1)[!i1]), title))[2:1], 
    c("Title", "Name"))
   Title  Name
1 Title1 Name1
2 Title1 Name2
3 Title2 Name3
4 Title3 Name4
5 Title3 Name5
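
The one-liner above is fairly dense, so here is the same computation unpacked into separate steps. This is only a restatement for readability; is_top, group, by_group, long and result are names made up for this sketch.

```r
name <- c("Top", "Name1", "Name2", "Top", "Name3", "Top", "Name4", "Name5")
title <- c("Title1", "Title2", "Title3")

is_top <- name == "Top"
group <- cumsum(is_top)     # block number per entry: 1 1 1 2 2 3 3 3

# split the non-"Top" entries into one list element per block
by_group <- split(name[!is_top], group[!is_top])

# label each block with its title (needs one title per block)
names(by_group) <- title

# stack() turns the named list into a two-column data frame: values, ind
long <- stack(by_group)

# put the title column first and rename both columns
result <- setNames(long[2:1], c("Title", "Name"))
result
```

Note that stack() returns the labels as a factor, so Title is a factor here; wrap it in as.character() if a character column is needed.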

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution     Source
Solution 1
Solution 2   Maël
Solution 3   akrun