How can I remove duplicate values in different columns of an R dataframe?

I would like a dataframe that removes duplicate values on a column-by-column basis.

In the example below, I would like to keep only the rows whose value in C1 does not appear anywhere in C3 or C4, and keep the whole row. So that:

Row 1 is deleted because "a" appears in row 3 of C3.
Row 2 is deleted because "b" appears in row 1 of C3.
Row 3 is not deleted because there is no "c" in C3 or C4.
Row 4 is deleted because "d" appears in rows 2 and 3 of C4.
Row 5 is not deleted because there is no "e" in C3 or C4.

How can I do this? Thank you very much.

df <- data.frame(
  "C1" = c("a", "b", "c", "d", "e"), 
  "C2" = c(1.2, 3.4, 4.5, 5.6, 7.8),
  "C3" = c("b", "b", "a", "d", "f"),
  "C4" = c("a","d","d","a", "g")) 


##   C1  C2 C3 C4
## 1  a 1.2  b  a
## 2  b 3.4  b  d
## 3  c 4.5  a  d
## 4  d 5.6  d  a
## 5  e 7.8  f  g

df_final <- data.frame(
  "C1" = c("c", "e"),
  "C2" = c(4.5, 7.8),
  "C3" = c("a", "f"),
  "C4" = c("d", "g"))

##   C1  C2 C3 C4
## 1  c 4.5  a  d
## 2  e 7.8  f  g

Solution 1:^[1]

library(dplyr)

df <- data.frame(
  "C1" = c("a", "b", "c", "d", "e"), 
  "C2" = c(1.2, 3.4, 4.5, 5.6, 7.8),
  "C3" = c("b", "b", "a", "d", "f"),
  "C4" = c("a","d","d","a", "g"))

df |>
  filter(!C1 %in% union(C3, C4))

##>   C1  C2 C3 C4
##> 1  c 4.5  a  d
##> 2  e 7.8  f  g
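
If you would rather not depend on dplyr, the same filter can be sketched in base R. This is only an illustration of the idea above, not part of the original answer: the helper name drop_if_seen and its arguments key and others are hypothetical, and it assumes the compared columns can be treated as character values.

```r
df <- data.frame(
  C1 = c("a", "b", "c", "d", "e"),
  C2 = c(1.2, 3.4, 4.5, 5.6, 7.8),
  C3 = c("b", "b", "a", "d", "f"),
  C4 = c("a", "d", "d", "a", "g")
)

# Hypothetical helper: keep the rows whose `key` value appears in
# none of the `others` columns.
drop_if_seen <- function(data, key, others) {
  # Pool every value found in the comparison columns into one vector.
  seen <- unique(unlist(lapply(data[others], as.character), use.names = FALSE))
  # Keep a row only if its key value is absent from that pool.
  data[!(as.character(data[[key]]) %in% seen), , drop = FALSE]
}

result <- drop_if_seen(df, key = "C1", others = c("C3", "C4"))
result

##   C1  C2 C3 C4
## 3  c 4.5  a  d
## 5  e 7.8  f  g
```

Unlike filter(), base subsetting keeps the original row names (3 and 5 here). If you want them renumbered, rownames(result) <- NULL should do it.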

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	Stefano Barbi
