Simulate "complex" data
I would like to simulate a data set in R under the following assumptions:
- n = 100, with 50 subjects in each group
- a binary variable for gender (the probability of being female should be 40%)
- a normally distributed variable N(mu, 10), where mu is assumed to be 24 for women and 22 for men.
Simulating the independent variables seems intuitive, but it is not clear to me how to simulate the normally distributed variable, whose mean depends on gender. Does anybody have an idea?
Some code to use:
set.seed(1234)
n <- 100
x1 <- rbinom(n, 1, 0.5) # trt
x2 <- rbinom(n, 1, 0.4) # 40% women
data.frame(x1, x2)
Solution 1:[1]
rnorm() accepts a vector for its mean argument, so you can give each subject its own mean (here x2 == 0 means male). Try
rnorm(n, mean = ifelse(x2 == 0, 22, 24), sd = 10)
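Put together with the code from the question, this could look like the sketch below. The column name y is my own choice, and I am assuming that the 10 in N(mu, 10) is a standard deviation; if it is meant as a variance, use sd = sqrt(10) instead.

```r
set.seed(1234)
n <- 100
x1 <- rbinom(n, 1, 0.5)  # treatment indicator
x2 <- rbinom(n, 1, 0.4)  # 1 = female (probability 0.4), 0 = male

# One draw per subject; the mean vector lines up element-wise with x2
y <- rnorm(n, mean = ifelse(x2 == 0, 22, 24), sd = 10)  # assumes 10 is the sd

dat <- data.frame(x1, x2, y)

# Sample means per gender; with n = 100 and sd = 10 they will only be
# roughly 22 and 24
aggregate(y ~ x2, data = dat, FUN = mean)
```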
It might be worth generating factors rather than numeric values for your categorical variables, e.g.
x2 <- factor(sample(c("male", "female"), size = n,
                    replace = TRUE, prob = c(0.6, 0.4)))
That way you don't have to spend as much brainpower keeping track of whether (e.g.) 0 means male or female ...
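With factors, one way to pick the mean is a named lookup vector, roughly as sketched here. The names trt, gender, mu and y and the labels "control"/"treatment" are just illustrative. I also use rep() plus a shuffle for the treatment, because rbinom(n, 1, 0.5) only gives 50 per group on average, whereas the question asks for exactly 50 in each group.

```r
set.seed(1234)
n <- 100

# rep() fixes the group sizes at exactly n / 2 each; sample() shuffles the order
trt <- factor(sample(rep(c("control", "treatment"), each = n / 2)))

gender <- factor(sample(c("male", "female"), size = n,
                        replace = TRUE, prob = c(0.6, 0.4)))

# Named lookup vector: indexing by the gender label gives each subject's mean.
# as.character() makes sure we index by label, not by the factor's integer codes.
mu <- c(female = 24, male = 22)
y <- rnorm(n, mean = mu[as.character(gender)], sd = 10)  # assumes 10 is the sd

dat <- data.frame(trt, gender, y)
table(dat$trt)
```

The lookup vector keeps the gender-to-mean mapping in one place, so adding another category only means adding another named element.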
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
