I want to create a synthetic dataframe with two variables with repeat observations
I have to create a synthetic dataset with multiple variables and more than 50 observations. I have chosen to create synthetic data for an oil field that has 10 wells and five producing reservoirs. So my dataframe would have 3 variables: "Well ID", "Reservoir Name" and "Reservoir Quality".
So, I want to create a dataframe in which each well has 5 reservoirs, and each reservoir has 3 rock qualities: "Sand", "Shale", and "Cement".
I tried for 2 variables in a crude way -
well1 <- data.frame(Wells = rep(1, 5), Reservoirs = c("A", "B", "C", "D","E"))
well2 <- data.frame(Wells = rep(2, 5), Reservoirs = c("A", "B", "C", "D","E"))
.
.
static_data <- rbind(well1,well2,...)
Now I am struggling with how to add the 3rd variable. Is there a smarter way of doing this?
I am looking for something like this -
| Well | Reservoir | Rock Quality |
|---|---|---|
| 1 | A | Sand |
| 1 | A | Shale |
| 1 | A | Cement |
| 1 | B | Sand |
| 1 | B | Shale |
| 1 | B | Cement |
Solution 1:[1]
The data.table package has a cross-join function, CJ(), that I think gives what you need.
library(data.table)
CJ(a=c(1,2,3), b=c('a', 'b'), c=c('Y', 'Z'))
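Applied to the wells in the question, the same call might look like the sketch below. The column names Well, Reservoir and RockQuality and the object name static_data are taken from the question or chosen for illustration. By default CJ() sorts each column, which would list "Cement" first, so this assumes sorted = FALSE is wanted to keep the Sand, Shale, Cement order from the desired table.

```r
library(data.table)

# Every combination of well, reservoir, and rock quality.
# sorted = FALSE keeps the values in the order given (Sand, Shale, Cement)
# instead of sorting each column.
static_data <- CJ(
  Well        = 1:10,
  Reservoir   = c("A", "B", "C", "D", "E"),
  RockQuality = c("Sand", "Shale", "Cement"),
  sorted      = FALSE
)

nrow(static_data)     # 10 wells * 5 reservoirs * 3 qualities = 150 rows
head(static_data, 6)  # first six rows: well 1, reservoirs A and B
```

Base R's expand.grid() builds the same combinations without an extra package, but it varies its first argument fastest, so the columns would need to be passed in reverse order to get this row ordering.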
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Daniel Warner |
