Conditional sampling by group based on sample mean
I am trying to use R to generate a set of different trivia quizzes. I have a large dataset (quiz_df) of questions, each with a category and a difficulty, that looks like this:
ID Category Difficulty
1 1 Sports 3
2 2 Science 7
3 3 Low culture 4
4 4 High culture 2
5 5 Geography 8
6 6 Lifestyle 3
7 7 Society 3
8 8 History 5
9 9 Sports 2
10 10 Science 8
... ... ... ...
1000 1000 Science 3
Now I want to randomly sample 3 questions from each of the 8 categories (24 questions in total) so that the mean difficulty is 4, i.e. the difficulties sum to 4 * 24 = 96.
library(dplyr)
set.seed(100)
quiz1 <- quiz_df %>% group_by(Category) %>% sample_n(3)
This creates a random quiz with 3 questions in each category, but it does not take difficulty into account. I am aware of the weight argument of sample_n:
library(dplyr)
set.seed(100)
quiz1 <- quiz_df %>% group_by(Category) %>% sample_n(3, weight = Difficulty)
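But this does not solve the issue: the weights only change how likely each question is to be drawn and do not constrain the total difficulty. Ideally, I would like to add an option like sum = 96 or mean = 4.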
Does anyone have any clues?
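For reference, one simple way to get the behaviour described above is rejection sampling: draw 3 questions per category as before, and redraw until the difficulties sum to 96. The sketch below is only an illustration under assumptions. The simulated quiz_df (random categories, difficulties on an assumed 1 to 8 scale) stands in for the real data, and the helper sample_quiz with its arguments n_per_cat, target_sum and max_tries is hypothetical, not part of dplyr.

```r
library(dplyr)

set.seed(100)

# Hypothetical stand-in for the real quiz_df: 1000 questions,
# 8 categories, difficulty on an assumed 1-8 scale.
categories <- c("Sports", "Science", "Low culture", "High culture",
                "Geography", "Lifestyle", "Society", "History")
quiz_df <- data.frame(
  ID = 1:1000,
  Category = sample(categories, 1000, replace = TRUE),
  Difficulty = sample(1:8, 1000, replace = TRUE)
)

# Hypothetical helper: redraw n_per_cat questions per category until
# the total difficulty equals target_sum, giving up after max_tries.
sample_quiz <- function(df, n_per_cat = 3, target_sum = 96, max_tries = 10000) {
  for (i in seq_len(max_tries)) {
    quiz <- df %>%
      group_by(Category) %>%
      sample_n(n_per_cat) %>%
      ungroup()
    if (sum(quiz$Difficulty) == target_sum) {
      return(quiz)
    }
  }
  stop("No quiz with the target sum found in ", max_tries, " tries")
}

quiz1 <- sample_quiz(quiz_df)
mean(quiz1$Difficulty)  # 96 / 24 questions
```

Calling sample_quiz repeatedly would give different quizzes that all meet the target. The number of redraws needed depends on how far the target is from the typical sum of a random draw, so this may be slow when the target mean is far from the dataset's average difficulty.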
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
