'Calculate the ratio of specific columns with the same values as other columns in Pandas
Given a DataFrame df as follows:
id target1 target2 value1 value2 value3
0 1 1 -1 NaN 0.0 NaN
1 2 -1 1 1.0 1.0 1.0
2 3 1 -1 1.0 -1.0 -1.0
3 4 1 -1 1.0 1.0 1.0
4 5 -1 -1 -1.0 1.0 1.0
5 6 -1 -1 1.0 -1.0 -1.0
6 7 -1 1 -1.0 NaN 1.0
Let's say for columns target1 and target2, I hope to know which value columns (value1, value2, value3, value4, etc.) have more same values with them respectively. Please note if value column's values are NaNs, it will not be included in the comparison.
For example, for target1-value1, we find id is 3, 4, 5 and 7, they have same values, so the ratio is 4/6 = 0.666667 (id==1 will not be included since value1 is NaN for that cell). Same logic for other columns.
So my question is how could I get the expected result below with Pandas?
value_cols target1 target2
0 value1 0.666667 0.333333
1 value2 0.333333 0.500000
2 value3 0.333333 0.666667
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|
