pandas Series.str.contains extracted rows that shouldn't have appeared
I have data like the Series below.
I'm extracting the rows that contain 1.1.1.1 with Series.str.contains(pttn, regex=False):
import pandas as pd

pttn = '1.1.1.1'
dd = pd.Series(['1.1.1.1, 2.22.3.107', '1.1.1.1', '2.2.2.2, 1.1.1.1', '2.2.2.2, 1.1.1.14', '1.1.1.15', '1.1.1.100', '1.1.1.101', '1.1.1.109'])
dd[dd.str.contains(pttn, regex=False, na=False)]
and I got this unexpected result:
0 1.1.1.1, 2.22.3.107
1 1.1.1.1
2 2.2.2.2, 1.1.1.1
3 2.2.2.2, 1.1.1.14
4 1.1.1.15
5 1.1.1.100
6 1.1.1.101
7 1.1.1.109
dtype: object
but what I actually want is only:
0 1.1.1.1, 2.22.3.107
1 1.1.1.1
2 2.2.2.2, 1.1.1.1
dtype: object
Solution 1:[1]
Update
>>> dd[dd.str.split(', ').explode().loc[lambda x: x == pttn].index]
0 1.1.1.1, 2.22.3.107
1 1.1.1.1
2 2.2.2.2, 1.1.1.1
dtype: object
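To see why this works, it can help to look at the intermediate steps. The following is only a sketch using the dd and pttn from the question; the names tokens and hits are made up for illustration, and the call to unique() is my own addition (it guards against a row that lists the same address twice), not part of the one-liner above.

```python
import pandas as pd

pttn = '1.1.1.1'
dd = pd.Series(['1.1.1.1, 2.22.3.107', '1.1.1.1', '2.2.2.2, 1.1.1.1',
                '2.2.2.2, 1.1.1.14', '1.1.1.15', '1.1.1.100',
                '1.1.1.101', '1.1.1.109'])

# Split each row on ', ' and give every address its own row.
# explode() repeats the original row label for each piece.
tokens = dd.str.split(', ').explode()

# Keep the pieces that equal pttn exactly, then collect the labels
# of the original rows they came from (each label only once).
hits = tokens[tokens == pttn].index.unique()

# Select those rows from the original Series.
result = dd.loc[hits]
print(result)
```

Because the comparison is made on whole addresses after the split, '1.1.1.14' or '1.1.1.100' no longer count as matches.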
Old answer
You are looking for str.fullmatch:
>>> dd[dd.str.fullmatch(pttn)]
1 1.1.1.1
dtype: object
Or
>>> dd[dd == pttn]
1 1.1.1.1
dtype: object
The advantage of str.fullmatch is that you can use a regular expression and control whether the match is case-sensitive.
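One thing to keep in mind is that str.fullmatch always treats the pattern as a regular expression, so each '.' in pttn matches any character unless it is escaped. A small sketch of the difference, and of the case parameter, follows; the Series s and hosts are hypothetical data invented for this example, not from the question.

```python
import re
import pandas as pd

pttn = '1.1.1.1'

# Hypothetical data chosen to show the effect of escaping.
s = pd.Series(['1.1.1.1', '1a1b1c1', '1.1.1.15'])

# Unescaped: every '.' matches any single character.
loose = s.str.fullmatch(pttn)

# Escaped: every '.' has to be a literal dot.
strict = s.str.fullmatch(re.escape(pttn))

# case=False makes the match case-insensitive (hypothetical host names).
hosts = pd.Series(['Host-A', 'host-a', 'host-ab'])
same_host = hosts.str.fullmatch('host-a', case=False)

print(loose.tolist(), strict.tolist(), same_host.tolist())
```

For IP addresses, escaping with re.escape is probably what you want whenever the pattern goes through a regex-based method.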
Solution 2:[2]
Simply use
newdd = dd[dd == pttn]
Your solution uses contains, which checks for a substring, and indeed, all values in dd contain the string '1.1.1.1' somewhere, so they all match.
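If you would rather keep using contains and still get rows such as '2.2.2.2, 1.1.1.1', one possible approach (not taken from either answer, just a sketch) is to switch to a regular expression that escapes the dots and requires that the address is not directly next to another digit or dot. This assumes addresses are separated by something other than digits and dots, as with the ', ' in the question; the name bounded is made up for illustration.

```python
import re
import pandas as pd

pttn = '1.1.1.1'
dd = pd.Series(['1.1.1.1, 2.22.3.107', '1.1.1.1', '2.2.2.2, 1.1.1.1',
                '2.2.2.2, 1.1.1.14', '1.1.1.15', '1.1.1.100',
                '1.1.1.101', '1.1.1.109'])

# Plain substring test: true for every row, which is why contains
# with regex=False returned everything.
print(all(pttn in value for value in dd))

# Escaped pattern that must sit at the start/end of the string or next
# to a character that is neither a digit nor a dot.
bounded = rf'(?:^|[^\d.]){re.escape(pttn)}(?:$|[^\d.])'

result = dd[dd.str.contains(bounded, regex=True, na=False)]
print(result)
```

The non-capturing groups (?:...) avoid the warning pandas emits when a pattern passed to contains has capture groups.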
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | 9769953 |
