How to select values from pandas dataframe by column value
I am analyzing a dataset with 6 zero-based classes (0 to 5). The dataset is many thousands of items long.
I need two dataframes: one with classes 0 & 1, and another with classes 3 & 5.
I can get 0 & 1 together easily enough:
mnist_01 = mnist.loc[mnist['class']<= 1]
However, I am not sure how to get classes 3 & 5... so what I would like to be able to do is:
mnist_35 = mnist.loc[mnist['class'] == (3 or 5)]
...rather than doing:
mnist_3 = mnist.loc[mnist['class'] == 3]
mnist_5 = mnist.loc[mnist['class'] == 5]
mnist_35 = pd.concat([mnist_3,mnist_5],axis=0)
Solution 1:[1]
You can use isin, probably using set membership to make each check an O(1) time complexity operation:
import pandas as pd

mnist = pd.DataFrame({'class': [0, 1, 2, 3, 4, 5],
                      'val': ['a', 'b', 'c', 'd', 'e', 'f']})
>>> mnist.loc[mnist['class'].isin({3, 5})]
   class val
3      3   d
5      5   f
>>> mnist.loc[mnist['class'].isin({0, 1})]
   class val
0      0   a
1      1   b
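For comparison, here is a minimal sketch of why the `== (3 or 5)` attempt from the question does not do what was hoped, next to an explicit boolean mask that should give the same rows as isin. It reuses the toy `mnist` frame from above; the result names (`only_3`, `mask_35`, `isin_35`, `not_35`) are just illustrative and not from the original post.

```python
import pandas as pd

# Same toy frame as in the answer above.
mnist = pd.DataFrame({'class': [0, 1, 2, 3, 4, 5],
                      'val': ['a', 'b', 'c', 'd', 'e', 'f']})

# Python evaluates (3 or 5) before pandas sees it; the result is 3,
# so this comparison keeps only the rows where class == 3.
only_3 = mnist.loc[mnist['class'] == (3 or 5)]

# Explicit mask: combine two element-wise comparisons with |.
# The parentheses are needed because | binds more tightly than ==.
mask_35 = mnist.loc[(mnist['class'] == 3) | (mnist['class'] == 5)]

# The isin form from the answer, here with a list instead of a set.
isin_35 = mnist.loc[mnist['class'].isin([3, 5])]

# ~ inverts the mask, selecting every class except 3 and 5.
not_35 = mnist.loc[~mnist['class'].isin([3, 5])]

print(only_3)
print(mask_35)
print(not_35)
```

The `|` form is fine for two values, but isin tends to read better as the list of wanted classes grows.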
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |