'Convert the -ve values into 1 and reduce those values from +ve values of same variable in pandas
I have a dataframe
df_in = pd.DataFrame([["A",-2],["B",23],["A",-4],["A",14],["B",12],["A",34],["B",-4],["C",-1],["A",-5],["B",21],["C",4],["B",-6]], columns=['var', 'val'])
var val
A -2
B 23
A -4
A 14
B 12
A 34
B -4
C -1
A -5
B 21
C 4
B -6
I want to convert all -ve values to 1. find the difference between the value and 1, sum it at var level. Divide the sum by no. of +ve value of var and reduce that value from the + ve values of var. Ex: var A has 3 -ve value and 2 +ve values. Covert all -ve values to 1 and find the difference like diff between -2 and 1 is 3, -4 and 1 is 5 -5 and 1 is 6. Sum 3+5+6=14. A has 2 +values. So divide 14 by 2 which is 7. Now reduce 7 from +ve values(23,34) of A. Similarly repeat for other var also either by doing groupby.
Expected Output is:
df_out = pd.DataFrame([["A",1],["B",19],["A",1],["A",7],["B",8],["A",27],["B",1],["C",1],["A",1],["B",17],["C",2],["B",1]], columns=['var', 'val'])
var val
A 1
B 19
A 1
A 7
B 8
A 27
B 1
C 1
A 1
B 17
C 2
B 1
How to do it?
Solution 1:[1]
Use:
#Trues for positive
m = df_in['val'].gt(0)
#substract 1 form right side and aggregate negative values
neg = df_in.loc[~m, 'val'].rsub(1).groupby(df_in['var']).sum()
#count positive values
pos = df_in.loc[m, 'var'].value_counts()
#divide both
diff = neg.div(pos)
#mapping difference with subtract `val` for positive else set 1
df_in['val'] = np.where(m, df_in['val'].sub(df_in['var'].map(diff)), 1)
print (df_in)
var val
0 A 1.0
1 B 19.0
2 A 1.0
3 A 7.0
4 B 8.0
5 A 27.0
6 B 1.0
7 C 1.0
8 A 1.0
9 B 17.0
10 C 2.0
11 B 1.0
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | jezrael |
