Recursively run a function over a dataframe in R
I have a data frame which looks like this:
df
         ID1        ID2
1  ID3000135  ID7510682
2  ID3000468  ID4616306
3  ID3000468  ID5449818
4  ID3000618  ID8320544
5  ID3000900  ID4704654
6  ID3000900  ID7515654
7  ID3001020  ID2661458
8  ID3001047 ID10312344
9  ID3000135  ID2820432
10 ID3001101  ID3000468
df <- structure(list(ID1 = c("ID3000135", "ID3000468", "ID3000468",
"ID3000618", "ID3000900", "ID3000900", "ID3001020", "ID3001047",
"ID3000135", "ID3001101"), ID2 = c("ID7510682", "ID4616306",
"ID5449818", "ID8320544", "ID4704654", "ID7515654", "ID2661458",
"ID10312344", "ID2820432", "ID3000468")), row.names = c(NA, 10L
), class = "data.frame")
I have made a function that I want to run recursively over the data frame, which will randomly select one ID from the first row to remove, and will then remove all rows containing that ID (while keeping a note of that ID in another vector). I want this function to run until there are no rows left in the data frame.
all_samps_rem <- NULL
# Function to run recursively:
samp_num <- sample(1:2, 1)
samp_rem <- df[1,samp_num]
all_samps_rem <- rbind(all_samps_rem, samp_rem)
df <- dplyr::filter(df,
!ID1 == samp_rem,
!ID2 == samp_rem)
How do I keep running this function over df until there are no rows left? Please note that IDs may be listed more than once in either column of the data frame.
Solution 1:[1]
Based on the comment from akrun above, I used this solution:
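library(dplyr)  # provides %>%, filter() and if_any()
rem_IDs <- function(data) {
  set.seed(6)  # fixed seed, so every call gives the same result
  all_samps_rem <- NULL
  # Keep going until every row has been removed
  while (nrow(data) > 0) {
    # Pick one ID at random from all IDs still in the data
    v1 <- sample(c(data$ID1, data$ID2), 1)
    all_samps_rem <- rbind(all_samps_rem, v1)
    # Drop every row that contains the chosen ID in either column
    data <- data %>%
      filter(!if_any(ID1:ID2, ~ .x %in% v1))
  }
  all_samps_rem
}
inds_to_remove <- rem_IDs(df)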
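Note that the loop above samples from every ID still in the data, whereas the question asks for one of the two IDs in the first row. For comparison, here is a minimal base R sketch of that first-row rule. The function name remove_by_first_row and the seed value are made up for illustration and are not part of the original answer, and the sketch assumes the columns are named ID1 and ID2 as in the question.

```r
# Example data from the question
df <- structure(list(
  ID1 = c("ID3000135", "ID3000468", "ID3000468", "ID3000618", "ID3000900",
          "ID3000900", "ID3001020", "ID3001047", "ID3000135", "ID3001101"),
  ID2 = c("ID7510682", "ID4616306", "ID5449818", "ID8320544", "ID4704654",
          "ID7515654", "ID2661458", "ID10312344", "ID2820432", "ID3000468")
), row.names = c(NA, 10L), class = "data.frame")

# Hypothetical helper: apply the question's first-row rule until no rows remain
remove_by_first_row <- function(data) {
  removed <- character(0)
  while (nrow(data) > 0) {
    # Randomly choose ID1 or ID2 from the first remaining row
    col <- sample(c("ID1", "ID2"), 1)
    id <- data[1, col]
    removed <- c(removed, id)
    # Drop every row where the chosen ID appears in either column
    data <- data[data$ID1 != id & data$ID2 != id, , drop = FALSE]
  }
  removed
}

set.seed(1)  # arbitrary seed, only to make the example reproducible
removed_ids <- remove_by_first_row(df)
```

The loop always terminates because the chosen ID comes from the first row, so at least that row is dropped on each pass. Setting the seed outside the function leaves the caller free to get a different random selection on each run.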
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | icedcoffee |
