Discard 200 random healthy instances

I need to discard 200 random healthy instances. How do I implement this in RStudio?

This is the data frame:

https://www.kaggle.com/code/jamaltariqcheema/model-performance-and-comparison/data

I tried this, but I got an error:

kidney_disease$hd <- ifelse(test=kidney_disease$hd == 0, yes="Healthy", no="Unhealthy")


Solution 1:[1]

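The following may solve the problem.
Choose row numbers at random with sample, assign the default value "Healthy" to the new column hd, then assign the value "Unhealthy" to the randomly chosen rows. Note that this overwrites any existing hd column and relabels 200 random rows; it does not remove any rows from the data frame.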

set.seed(2022)   # Make results reproducible

i <- sample(nrow(kidney_disease), 200)   # 200 random row numbers
kidney_disease$hd <- "Healthy"           # default label for every row
kidney_disease$hd[i] <- "Unhealthy"      # relabel the sampled rows
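
If the goal is instead to remove 200 randomly chosen healthy rows from the data frame, a minimal sketch might look like the following. It assumes that hd already holds the labels "Healthy" and "Unhealthy" (as the ifelse call in the question would produce) and that at least 200 healthy rows exist. The small data frame built here and the names healthy_rows, drop_rows and kidney_reduced are hypothetical stand-ins for illustration, not part of the original data.

```r
set.seed(2022)  # make the random draw reproducible

# Hypothetical stand-in for kidney_disease; the real data has more columns.
kidney_disease <- data.frame(
  id = 1:400,
  hd = rep(c("Healthy", "Unhealthy"), times = c(250, 150))
)

# Row numbers of all healthy instances
healthy_rows <- which(kidney_disease$hd == "Healthy")

# Stop early if there are fewer than 200 healthy rows to choose from
stopifnot(length(healthy_rows) >= 200)

# Pick 200 of the healthy row numbers at random, without replacement
drop_rows <- sample(healthy_rows, 200)

# Keep every row except the sampled ones
kidney_reduced <- kidney_disease[-drop_rows, ]

# Count the remaining rows per label
table(kidney_reduced$hd)
```

With this stand-in data, 50 healthy and 150 unhealthy rows remain. Sampling from which(...) rather than from all row numbers ensures that only healthy rows can be dropped.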

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1