Recursively run a function over a dataframe in R

I have a data frame which looks like this:

df
         ID1        ID2
1  ID3000135  ID7510682
2  ID3000468  ID4616306
3  ID3000468  ID5449818
4  ID3000618  ID8320544
5  ID3000900  ID4704654
6  ID3000900  ID7515654
7  ID3001020  ID2661458
8  ID3001047 ID10312344
9  ID3000135  ID2820432
10 ID3001101  ID3000468

df <- structure(list(ID1 = c("ID3000135", "ID3000468", "ID3000468", 
"ID3000618", "ID3000900", "ID3000900", "ID3001020", "ID3001047", 
"ID3000135", "ID3001101"), ID2 = c("ID7510682", "ID4616306", 
"ID5449818", "ID8320544", "ID4704654", "ID7515654", "ID2661458", 
"ID10312344", "ID2820432", "ID3000468")), row.names = c(NA, 10L
), class = "data.frame")

I have written a function that I want to run repeatedly over the data frame. It randomly selects one of the two IDs in the first row, records that ID in a separate vector, and then removes every row containing that ID. I want it to keep running until there are no rows left in the data frame.

all_samps_rem <- NULL

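# Steps to repeat until df is empty:
samp_num <- sample(1:2, 1)     # pick column 1 (ID1) or 2 (ID2) at random
samp_rem <- df[1, samp_num]    # the ID from the first row to remove

all_samps_rem <- rbind(all_samps_rem, samp_rem)  # record the removed ID

# Drop every row that contains the removed ID in either column
df <- dplyr::filter(df,
        ID1 != samp_rem,
        ID2 != samp_rem)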

How do I keep running this function over df until there are no rows left? Please note that IDs may be listed more than once in either column of the data frame.

Solution 1:[1]

Based on the comment from akrun above, I used this solution:

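library(dplyr)  # provides %>%, filter() and if_any() (dplyr >= 1.0.4)

rem_IDs <- function(data) {
    set.seed(6)  # fixed seed so the random draws are reproducible
    all_samps_rem <- NULL

    # Loop until every row has been removed
    while (nrow(data) > 0) {
        # Draw one ID at random from all IDs still present in either column
        v1 <- sample(c(data$ID1, data$ID2), 1)
        # Record the removed ID (accumulates as a one-column matrix)
        all_samps_rem <- rbind(all_samps_rem, v1)
        # Drop every row that contains the drawn ID in ID1 or ID2
        data <- data %>%
            filter(!if_any(ID1:ID2, ~ .x %in% v1))
    }
    all_samps_rem
}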
        
inds_to_remove <- rem_IDs(df)
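
Note that rem_IDs() draws from every ID still present in the data, whereas the question describes picking one of the two IDs in the first remaining row. A minimal base R sketch of that first-row variant might look like the following. The function name remove_ids_first_row and its seed argument are hypothetical, the small data frame is just a subset of the example data, and the code assumes the columns are named ID1 and ID2 and contain no NA values.

```r
# Hypothetical helper (not part of the original answer): repeatedly take one
# ID from the first remaining row, record it, and drop all rows containing it.
remove_ids_first_row <- function(data, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)  # optional, for reproducible draws
  removed <- character(0)

  while (nrow(data) > 0) {
    # Choose one of the two ID columns at random, then read the first row
    col <- sample(c("ID1", "ID2"), 1)
    id <- data[[col]][1]
    removed <- c(removed, id)

    # Keep only rows where neither column holds the chosen ID
    # (assumes no NA values in ID1 or ID2)
    data <- data[data$ID1 != id & data$ID2 != id, , drop = FALSE]
  }

  removed
}

# Subset of the example data from the question
df <- data.frame(
  ID1 = c("ID3000135", "ID3000468", "ID3000468", "ID3001101"),
  ID2 = c("ID7510682", "ID4616306", "ID5449818", "ID3000468"),
  stringsAsFactors = FALSE
)

removed_ids <- remove_ids_first_row(df, seed = 6)
```

The loop always terminates because the chosen ID comes from the first row, so each pass removes at least that row. The result is a plain character vector rather than the one-column matrix that rbind() builds up.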

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 icedcoffee