'Pandas: for each row, get the matching column
I have a dataframe like this,
df = pd.DataFrame({'A':[1,-2,3,4,5],'B':[1,2,3,4,-5],'C':[1,2,3,-4,5],'value':[123,1567,1456,764,2456]})
print(df)
is,
A B C value
0 1 1 1 123
1 -2 2 2 1567
2 3 3 3 1456
3 4 4 -4 764
4 5 -5 5 2456
I know that there is only one negative value per row. I want to know in which column such a value is. I know I can do it like this,
desiredOutput = []
for i,row in df.iterrows():
if any(row < 0):
print(row[row < 0].index.to_numpy()[0],row[row[row < 0].index.to_numpy()[0]],row.value)
desiredOutput.append(row[row < 0].index.to_numpy()[0])
else:
desiredOutput.append(None)
print(desiredOutput)
gives,
A -2 1567
C -4 764
B -5 2456
[None, 'A', None, 'C', 'B']
But I imagine there must be a pythonnic way to do it (probably using .apply()?)
Solution 1:[1]
Use DataFrame.dot for matrix multiplication:
desiredOutput = df.lt(0).dot(df.columns).mask(lambda x: x.eq(''), None)
print (desiredOutput)
0 None
1 A
2 None
3 C
4 B
dtype: object
desiredOutput = df.lt(0).dot(df.columns).replace('',np.nan)
print (desiredOutput)
0 NaN
1 A
2 NaN
3 C
4 B
dtype: object
With apply it is possible, but slowier if filter columns by mask per rows:
desiredOutput = df.lt(0).apply(lambda x: next(iter(x.index[x]), None), axis=1)
print (desiredOutput)
0 None
1 A
2 None
3 C
4 B
dtype: object
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
