Extract possible sample combinations from multiple count constraints
I have input data like this:
| unique ID | Q1 | Q2 | Q3 |
|---|---|---|---|
| 1 | 1 | 1 | 2 |
| 2 | 1 | 1 | 2 |
| 3 | 1 | 0 | 3 |
| 4 | 2 | 0 | 1 |
| 5 | 3 | 1 | 2 |
| 6 | 4 | 1 | 3 |
My goal is to extract a subset of rows that satisfies the following conditions:
- total count: 4
- Q1=1 count: 2
- Q1=2 count: 1
- Q2=1 count: 1~3
- Q3=1 count: 1
In this case, the data sets with IDs [1, 2, 4, 5] and [2, 3, 4, 5] are both acceptable answers.
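To make the conditions concrete, here is a minimal Python sketch that encodes the sample table and checks a candidate selection against the count limits. The names `ROWS`, `CONSTRAINTS`, and `satisfies` are made up for illustration, and each limit is stored as an assumed `(min, max)` pair so that a range like 1~3 and an exact count like 2 share one representation.

```python
from collections import Counter

# Sample rows from the table above: id -> answers.
ROWS = {
    1: {"Q1": 1, "Q2": 1, "Q3": 2},
    2: {"Q1": 1, "Q2": 1, "Q3": 2},
    3: {"Q1": 1, "Q2": 0, "Q3": 3},
    4: {"Q1": 2, "Q2": 0, "Q3": 1},
    5: {"Q1": 3, "Q2": 1, "Q3": 2},
    6: {"Q1": 4, "Q2": 1, "Q3": 3},
}

TOTAL = 4
# (column, value) -> (min count, max count); exact counts use min == max.
CONSTRAINTS = {
    ("Q1", 1): (2, 2),
    ("Q1", 2): (1, 1),
    ("Q2", 1): (1, 3),
    ("Q3", 1): (1, 1),
}


def satisfies(ids, rows=ROWS, total=TOTAL, constraints=CONSTRAINTS):
    """Return True if the selected ids meet the total and every count limit."""
    ids = set(ids)
    if len(ids) != total:
        return False
    # Count how often each (column, value) pair occurs in the selection.
    counts = Counter()
    for i in ids:
        for col, val in rows[i].items():
            counts[(col, val)] += 1
    return all(lo <= counts[key] <= hi for key, (lo, hi) in constraints.items())


print(satisfies([1, 2, 4, 5]))  # True
print(satisfies([2, 3, 4, 5]))  # True
print(satisfies([1, 2, 3, 4]))  # False: three rows have Q1=1
```

Checking a given selection is cheap (linear in the selection size); the hard part is finding one.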
In reality, I may have 6000+ rows of data and up to 12 count constraints like the ones above. Each count may vary from 1 to 50. I've written a solution that first groups all IDs by condition, then uses depth-first search to exhaustively try all possible combinations between the groups. (I believe this is a brute-force solution...) However, I always run out of memory and time before I get a possible answer.
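For reference, a depth-first search of this kind might look roughly like the sketch below. It is an illustrative reconstruction, not the code I actually wrote: the function name `find_selection` and the `(min, max)` encoding of each limit are assumptions. It decides row by row whether to include a row, and abandons a branch as soon as an upper limit would be exceeded or a lower limit can no longer be reached. It keeps only the current path in memory, but the worst case is still exponential in the number of rows.

```python
def find_selection(rows, total, constraints):
    """Return one list of ids meeting the constraints, or None if none exists."""
    ids = sorted(rows)
    keys = list(constraints)
    # matches[k][p] is True when the row at position p counts toward key k.
    matches = {k: [rows[i][k[0]] == k[1] for i in ids] for k in keys}
    # remaining[k][p] = number of rows at positions >= p that match key k.
    remaining = {}
    for k in keys:
        suffix = [0] * (len(ids) + 1)
        for p in range(len(ids) - 1, -1, -1):
            suffix[p] = suffix[p + 1] + matches[k][p]
        remaining[k] = suffix

    chosen = []
    counts = {k: 0 for k in keys}

    def dfs(p):
        need = total - len(chosen)
        if need == 0:
            # Upper limits are enforced on inclusion; check lower limits here.
            return all(counts[k] >= constraints[k][0] for k in keys)
        if need > len(ids) - p:
            return False  # not enough rows left to reach the total
        for k in keys:
            # Prune: even taking every usable matching row cannot reach the minimum.
            if counts[k] + min(need, remaining[k][p]) < constraints[k][0]:
                return False
        # Branch 1: include row p if no upper limit would be exceeded.
        hit = [k for k in keys if matches[k][p]]
        if all(counts[k] < constraints[k][1] for k in hit):
            chosen.append(ids[p])
            for k in hit:
                counts[k] += 1
            if dfs(p + 1):
                return True
            chosen.pop()
            for k in hit:
                counts[k] -= 1
        # Branch 2: skip row p.
        return dfs(p + 1)

    return list(chosen) if dfs(0) else None


# The sample table and limits from above.
sample_rows = {
    1: {"Q1": 1, "Q2": 1, "Q3": 2},
    2: {"Q1": 1, "Q2": 1, "Q3": 2},
    3: {"Q1": 1, "Q2": 0, "Q3": 3},
    4: {"Q1": 2, "Q2": 0, "Q3": 1},
    5: {"Q1": 3, "Q2": 1, "Q3": 2},
    6: {"Q1": 4, "Q2": 1, "Q3": 3},
}
sample_limits = {
    ("Q1", 1): (2, 2),
    ("Q1", 2): (1, 1),
    ("Q2", 1): (1, 3),
    ("Q3", 1): (1, 1),
}

print(find_selection(sample_rows, 4, sample_limits))  # [1, 2, 4, 5]
```

Note that the recursion depth equals the number of rows, so with thousands of rows a version like this would need an explicit stack or a raised recursion limit in Python.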
My questions are:
- What is the lowest possible time complexity for this problem? (I believe it is a kind of subset sum problem, but I am not sure.)
- How can I solve this problem without brute force? I'm considering dynamic programming or a decision tree. However, I believe I would probably run out of memory with either of these. Or can I solve this problem using each data row's probabilities/entropy? (I would appreciate more details on this.)
My brute-force code is not worth reading, so I'll skip posting it.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
