Remove rows in an R data frame using two linked conditions (a combination of values) from another data frame

I have two different datasets in R (MRE below).

One (ModuleViews) contains one row per module visit; the other (PageViews) contains one row for each page visit within a module visit.

In both datasets, the moduleid column holds the same module codes and the session_id column holds the same session codes.

I processed the PageViews dataset and I would now like to update the ModuleViews dataset accordingly.

For this I need R to match on moduleid AND session_id together, because within one session (e.g. 25) a user can visit several modules (for session 25: modules 1697, 1698 and 1755).

In this case my processing removed all page views of session 25, module 1697.

I now want to remove this row (and every other row) from the ModuleViews dataset whose combination of moduleid and session_id does not occur in the PageViews dataset.

I tried the following 3 ways:

ModuleViews <- subset(ModuleViews, ModuleViews$session_id %in% PageViews$session_id & 
                         ModuleViews$moduleid %in% PageViews$moduleid)

ModuleViews <- ModuleViews[(ModuleViews$session_id %in% PageViews$session_id) && 
                         (ModuleViews$moduleid %in% PageViews$moduleid),]

ModuleViews$moduleid <- ifelse((ModuleViews$session_id %in% PageViews$session_id) & 
                         (ModuleViews$moduleid %in% PageViews$moduleid), ModuleViews$moduleid, NA) 

But these check the two columns separately rather than in combination, which leaves session 25, module 1697 in the output.
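
One way to see why this happens: each `%in%` is evaluated against a whole column of PageViews, so a row passes as long as its session occurs somewhere and its module occurs somewhere, not necessarily on the same row. A small sketch with made-up data (the data frames `mv` and `pv` and session 30 are hypothetical, chosen so that module 1697 still appears under a different session):

```r
# Hypothetical miniature versions of ModuleViews (mv) and PageViews (pv).
mv <- data.frame(session_id = c(25L, 25L, 30L),
                 moduleid   = c(1697L, 1698L, 1697L))
pv <- data.frame(session_id = c(25L, 30L),
                 moduleid   = c(1698L, 1697L))

# Each condition is checked against an entire column, independently.
separate <- mv$session_id %in% pv$session_id &
            mv$moduleid   %in% pv$moduleid

# Row 1 (session 25, module 1697) passes: 25 occurs in pv$session_id and
# 1697 occurs in pv$moduleid, even though that pair is on no single row of pv.
separate
```

Here every element of `separate` is TRUE, so nothing is filtered out.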

I tried each of these with %in% as well as with ==, but with == I get a length error (obviously because the datasets have different lengths):

Error: Must subset rows with a valid subscript vector. ℹ Logical subscripts must match the size of the indexed input. x Input has size 220099 but subscript r has size 2024529.
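
The length mismatch is consistent with how the two operators work: `==` compares vectors position by position (recycling the shorter one), so its result is as long as the longer vector, whereas `%in%` returns one value per element of its left-hand side. A minimal sketch with made-up vectors (`a` and `b` are hypothetical stand-ins for the two session_id columns):

```r
# Hypothetical stand-ins for the two session_id columns (lengths 3 and 6).
a <- c(19L, 24L, 25L)
b <- c(25L, 19L, 24L, 25L, 19L, 24L)

# `==` compares position by position, recycling `a` to the length of `b`.
eq <- a == b
length(eq)    # 6: too long to index a data frame with 3 rows
eq            # all FALSE here, although every value of a occurs in b

# `%in%` asks, for each element of `a`, whether it occurs anywhere in `b`.
isin <- a %in% b
length(isin)  # 3: same length as the left-hand side
isin          # all TRUE
```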

How can I make R evaluate both conditions together for each row?
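
To make the goal concrete, here is one possible approach, sketched under the assumption that a base R solution is acceptable: combine both columns into a single key per row and apply `%in%` to that key. The data is taken from the MRE, with PageViews reduced to its distinct session_id/moduleid pairs; the names `PageViewPairs`, `key_mv`, `key_pv` and `ModuleViewsKept` are illustrative only.

```r
# ModuleViews as in the MRE (plain data.frame instead of a tibble).
ModuleViews <- data.frame(
  session_id        = c(19L, 19L, 24L, 25L, 25L, 25L, 28L),
  moduleid          = c(397L, 902L, 690L, 1697L, 1698L, 1755L, 1271L),
  numslidesread     = c(1L, 1L, 31L, 2L, 31L, 44L, 3L),
  totalsecondsspent = c(5L, 13L, 5829L, 10955L, 6942L, 9725L, 667L)
)

# Distinct session_id/moduleid pairs that occur in the MRE's PageViews.
PageViewPairs <- data.frame(
  session_id = c(19L, 19L, 24L, 25L, 25L),
  moduleid   = c(397L, 902L, 690L, 1698L, 1755L)
)

# One key per row built from both columns; the separator keeps pairs such as
# (1, 23) and (12, 3) from collapsing into the same string.
key_mv <- paste(ModuleViews$session_id, ModuleViews$moduleid, sep = "_")
key_pv <- paste(PageViewPairs$session_id, PageViewPairs$moduleid, sep = "_")

# Keep only the rows whose combined key also occurs in PageViews.
ModuleViewsKept <- ModuleViews[key_mv %in% key_pv, ]
ModuleViewsKept
```

With this data, session 25 / module 1697 and session 28 / module 1271 are dropped and the other five rows remain. If dplyr is available, `dplyr::semi_join(ModuleViews, PageViews, by = c("session_id", "moduleid"))` should express the same filter directly on the full PageViews table.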

TIA!

ModuleViews:

structure(list(session_id = c(19L, 19L, 24L, 25L, 25L, 25L, 28L
), moduleid = c(397L, 902L, 690L, 1697L, 1698L, 1755L, 1271L), 
    numslidesread = c(1L, 1L, 31L, 2L, 31L, 44L, 3L), totalsecondsspent = c(5L, 
    13L, 5829L, 10955L, 6942L, 9725L, 667L)), row.names = c(NA, 
-7L), class = c("tbl_df", "tbl", "data.frame"))

PageViews:

structure(list(session_id = c(19L, 19L, 24L, 24L, 24L, 24L, 24L, 
24L, 24L, 24L, 24L, 24L, 24L, 24L, 24L, 24L, 24L, 24L, 24L, 24L, 
24L, 24L, 24L, 24L, 24L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 
25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 
25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 
25L), slideitem_id = c(19974L, 53092L, 37143L, 37004L, 37061L, 
37055L, 37061L, 37062L, 37073L, 37079L, 37079L, 37080L, 37097L, 
37124L, 37131L, 37136L, 37138L, 37143L, 37143L, 37144L, 37145L, 
37170L, 65628L, 37191L, 37192L, 85817L, 85818L, 85819L, 85820L, 
85821L, 85821L, 85822L, 85823L, 85824L, 85825L, 85826L, 85827L, 
85828L, 85829L, 85828L, 85829L, 85830L, 85831L, 85832L, 85833L, 
85834L, 85835L, 85836L, 85837L, 85838L, 85839L, 85840L, 85841L, 
85842L, 85624L, 85234L, 85235L, 85607L, 85614L, 85619L), moduleid = c(397L, 
902L, 690L, 690L, 690L, 690L, 690L, 690L, 690L, 690L, 690L, 690L, 
690L, 690L, 690L, 690L, 690L, 690L, 690L, 690L, 690L, 690L, 690L, 
690L, 690L, 1698L, 1698L, 1698L, 1698L, 1698L, 1698L, 1698L, 
1698L, 1698L, 1698L, 1698L, 1698L, 1698L, 1698L, 1698L, 1698L, 
1698L, 1698L, 1698L, 1698L, 1698L, 1698L, 1698L, 1698L, 1698L, 
1698L, 1698L, 1698L, 1698L, 1755L, 1755L, 1755L, 1755L, 1755L, 
1755L), secondsspentonslide = c(5L, 13L, 154L, 9L, 5L, 9L, 248L, 
17L, 385L, 209L, 364L, 61L, 81L, 175L, 45L, 352L, 23L, 216L, 
35L, 227L, 80L, 375L, 7L, 3L, 3L, 21L, 8L, 43L, 211L, 61L, 37L, 
58L, 50L, 96L, 67L, 36L, 21L, 11L, 3L, 7L, 96L, 66L, 9L, 79L, 
180L, 144L, 127L, 168L, 22L, 49L, 22L, 51L, 127L, 33L, 19L, 5L, 
25L, 73L, 7L, 15L)), row.names = c(NA, -60L), class = c("tbl_df", 
"tbl", "data.frame"))


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
