Comparing z-scores per value and marking values as NaN when the score exceeds a threshold
I am working on a requirement where I compute the z-score of every value in a DataFrame, column by column, and then check each individual value against it. If a value's z-score is greater than 1, I want to replace that specific value with NaN, so that I can later fill it in using an appropriate technique.
I have the following code:
import pandas as pd

s = {'2014': [1, 1, 2, 2], '2015': [12, 22, 33, 44], '2016': [55, 66, 77, 88], '2017': [2, 3, 4, 5]}
p = pd.DataFrame(data=s)
   2014  2015  2016  2017
0     1    12    55     2
1     1    22    66     3
2     2    33    77     4
3     2    44    88     5
I computed the z-scores as follows:
df_zscore = (p - p.mean()) / p.std()
       2014      2015      2016      2017
0 -0.866025 -1.139879 -1.161895 -1.161895
1 -0.866025 -0.416146 -0.387298 -0.387298
2  0.866025  0.379960  0.387298  0.387298
3  0.866025  1.176065  1.161895  1.161895
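A note on the numbers above: `DataFrame.std()` defaults to the sample standard deviation (`ddof=1`), so these scores are somewhat smaller in magnitude than population z-scores (`ddof=0`). Which convention is intended is not stated in the question. The following is a minimal sketch of the difference, using the same frame `p`; the names `z_sample` and `z_population` are illustrative only.

```python
import pandas as pd

s = {'2014': [1, 1, 2, 2], '2015': [12, 22, 33, 44],
     '2016': [55, 66, 77, 88], '2017': [2, 3, 4, 5]}
p = pd.DataFrame(data=s)

# Sample z-score: std() defaults to ddof=1, i.e. it divides by n - 1.
# This reproduces the table shown in the question.
z_sample = (p - p.mean()) / p.std()

# Population z-score: ddof=0 divides by n instead (illustrative alternative).
z_population = (p - p.mean()) / p.std(ddof=0)

print(z_sample.round(6))
print(z_population.round(6))
```

With only four rows the choice matters: a value sitting close to the threshold can fall on either side of it depending on `ddof`.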
If a value's z-score is greater than 1, the desired output should look like this:
   2014  2015  2016  2017
0     1    12    55     2
1     1    22    66     3
2     2    33    77     4
3     2   NaN   NaN   NaN
(Those values are marked as NaN because their z-scores are greater than 1.)
How can I get this result?
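One possible approach, offered as a sketch and not as the accepted answer: `DataFrame.mask` replaces values with NaN wherever a boolean condition is True, so passing it `df_zscore > 1` should produce the desired frame. The names `result` and `filled` are illustrative, and filling with the column median is only an assumed example of "an appropriate technique".

```python
import pandas as pd

s = {'2014': [1, 1, 2, 2], '2015': [12, 22, 33, 44],
     '2016': [55, 66, 77, 88], '2017': [2, 3, 4, 5]}
p = pd.DataFrame(data=s)

# Column-wise z-scores, computed exactly as in the question.
df_zscore = (p - p.mean()) / p.std()

# mask() keeps a value where the condition is False and writes NaN where it is True.
result = p.mask(df_zscore > 1)
print(result)

# Assumed follow-up: fill the masked cells with each column's median.
# Any other imputation method could be substituted here.
filled = result.fillna(result.median())
print(filled)
```

Columns that receive a NaN are converted to floating point, since NaN cannot be stored in an integer column. The condition above is one-sided, matching the desired output; `p.mask(df_zscore.abs() > 1)` would also mask strongly negative scores, which here would additionally blank row 0 of 2015, 2016 and 2017.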
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
