Information Gain Calculation for 3 Classes
I have a problem that requires using information gain for feature selection, but the target variable has 3 classes (positive, negative, and neutral) instead of two. As such, I am a little confused about how the calculation for info would work.
For 2-class problems, I know that info(p,n) = 0 if all the objects are in one class and there are none in the other. Would this be the same for 3-class problems? What if 2 classes have objects and the third class does not?
For example, if I have an attribute split as follows, would the following info(p,n,neu) calculations be correct?
The formula I am using for info is:

$$
\text{info}(p, n, neu) = -\frac{p}{p+n+neu}\log_2\frac{p}{p+n+neu} - \frac{n}{p+n+neu}\log_2\frac{n}{p+n+neu} - \frac{neu}{p+n+neu}\log_2\frac{neu}{p+n+neu}
$$
| Dept | Positive | Negative | Neutral | Info(p,n,neu) |
|---|---|---|---|---|
| X | 40 | 34 | 0 | 0 |
| Y | 60 | 70 | 70 | 0.3969 |
| Z | 34 | 0 | 36 | 0 |
| A | 68 | 83 | 0 | 0 |
| B | 0 | 50 | 0 | 0 |
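
One way to check the values in the table is to evaluate the formula directly. Below is a minimal Python sketch, assuming the usual convention that a class with a zero count contributes nothing to the sum (0·log2(0) is taken as 0). The function name `info` and the `rows` dictionary are just illustrative names, not from any library.

```python
from math import log2

def info(*counts):
    """Entropy in bits of a set of class counts, e.g. info(p, n, neu)."""
    total = sum(counts)
    if total == 0:
        return 0.0  # assumption: an empty split is treated as zero entropy
    # Skip zero counts: 0 * log2(0) is taken as 0 by convention.
    probs = [c / total for c in counts if c > 0]
    # p * log2(1/p) is the same quantity as -p * log2(p)
    return sum(p * log2(1 / p) for p in probs)

# (positive, negative, neutral) counts per Dept, copied from the table
rows = {
    "X": (40, 34, 0),
    "Y": (60, 70, 70),
    "Z": (34, 0, 36),
    "A": (68, 83, 0),
    "B": (0, 50, 0),
}

for dept, counts in rows.items():
    print(f"{dept}: info = {info(*counts):.4f}")
```

Under that convention, info is 0 only when a single class holds all the objects (as in row B). When two classes have objects and the third is empty, the third term drops out and the result is the ordinary 2-class value, which is close to 1 for a near-even split rather than 0. With three classes the maximum is log2(3) ≈ 1.585, so values above 1 are possible.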
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
