Why does "in" work for a pandas Series in a list comprehension and not as a logical expression?

If I want to loop through the values in a Series, I can do that using the in operator:

[x for x in pd.Series(['Hello, World!'])]

> ['Hello, World!']

but if I use in to check if Hello, World! is in the Series, it returns False.

'Hello, World!' in pd.Series(['Hello, World!'])

> False

Paradoxically (to the untrained eye), this behavior makes the following list comprehension return empty:

hello_series = pd.Series(['Hello, World!'])

[x for x in hello_series if x in hello_series]

> []

This is Series-specific behavior; it of course works fine with lists:

'Hello, World!' in ['Hello, World!']

> True

Why does in work in one context and not the other with Series, and for what reason(s)?



Solution 1:[1]

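I'm not quite sure if you're asking a practical question or a theoretical one. The theoretical answer is that whoever wrote the pandas code made a specific design decision: for a Series, in tests the index labels rather than the values, much as it tests the keys of a dict.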

  • Python interprets x in thing by calling thing.__contains__(x). If thing does not define __contains__, Python falls back to iterating over thing and comparing each item to x.

  • Python interprets for x in thing: by creating an iterator for thing and then getting items from that iterator until it raises StopIteration to signal that it has run out of items. A thing can either implement __iter__ to be explicit about its iterator, or Python can fall back to indexing: if thing defines __getitem__, Python calls thing[0], thing[1], and so on until IndexError is raised.
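The two mechanisms above can be spelled out by hand. The following is a rough sketch rather than the interpreter's actual implementation; Countdown and manual_for are hypothetical names made up for this illustration.

```python
class Countdown:
    """Hypothetical container that defines only __getitem__ (no __iter__)."""

    def __init__(self, start):
        self.start = start

    def __getitem__(self, i):
        # Fallback iteration calls this with 0, 1, 2, ... until IndexError.
        if not 0 <= i < self.start:
            raise IndexError(i)
        return self.start - i


def manual_for(thing):
    """Roughly what `for x in thing: collected.append(x)` does."""
    collected = []
    iterator = iter(thing)        # uses __iter__, or falls back to __getitem__
    while True:
        try:
            x = next(iterator)
        except StopIteration:     # the iterator has run out of items
            break
        collected.append(x)
    return collected


values = [10, 20, 30]
print(manual_for(values))         # [10, 20, 30]
print(values.__contains__(20))    # True, same result as `20 in values`
print(manual_for(Countdown(3)))   # [3, 2, 1]
```

Note that manual_for never calls __contains__, and the membership test never has to call __iter__ when __contains__ is defined.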

Because both of these constructs use the keyword in, it is natural to assume that they're related. But their implementations for a specific object can have nothing to do with each other.
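To make this concrete, here is a minimal sketch of a container whose membership test and iteration deliberately disagree. LabeledValues is a hypothetical toy class, not pandas code; it only assumes, as a pandas Series does, that membership looks at the labels while iteration yields the values.

```python
class LabeledValues:
    """Hypothetical toy container (not pandas): values stored under labels."""

    def __init__(self, values):
        # Default labels 0, 1, 2, ... are an assumption made for this sketch.
        self._data = dict(enumerate(values))

    def __contains__(self, key):
        # Membership tests the labels, like `key in some_dict`.
        return key in self._data

    def __iter__(self):
        # Iteration yields the stored values, like iterating over a list.
        return iter(self._data.values())


hello = LabeledValues(['Hello, World!'])

print([x for x in hello])                # ['Hello, World!']
print('Hello, World!' in hello)          # False: it is a value, not a label
print(0 in hello)                        # True: 0 is a label
print([x for x in hello if x in hello])  # []
print('Hello, World!' in list(hello))    # True: a list compares values
```

With a real Series the same pattern applies: 0 in hello_series is True because 0 is an index label. To test the values instead, one option is 'Hello, World!' in hello_series.values; another is hello_series.isin(['Hello, World!']).any().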

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Frank Yellin