How can I set the value of a Series at a specific index in a chainable style?

I can't figure out how to set the value of a Series at a specific index in a chainable style.

For example, say I have the following dataframe:

>>> df = pd.DataFrame({'a': [1,2,3], 'b': [0,0,0]})
>>> df
   a  b
0  1  0
1  2  0
2  3  0

If I want to change all the values of a column in a pipeline, I can use pandas.DataFrame.assign():

>>> df.assign(b=[4,5,6])
   a  b
0  1  4
1  2  5
2  3  6

...and then I can do other stuff with the dataframe on the same line, for example:

>>> df.assign(b=[4,5,6]).mul(100)
     a    b
0  100  400
1  200  500
2  300  600

But I can't do this for an individual value at a specific index in a Series.

>>> s = df['a']
>>> s
0    1
1    2
2    3
Name: a, dtype: int64

I can, of course, just use a normal Python assignment operation using =:

>>> s[1] = 9
>>> s
0    1
1    9
2    3
Name: a, dtype: int64

But the problems with that are:

  • It's in-place, so it modifies my existing dataframe
  • Assignment statements using = are not allowed in Python lambda functions

For example, what if I wanted to do this:

>>> df.apply(lambda x: x['b', 0] = 13, axis=1)
  File "<stdin>", line 1
    df.apply(lambda x: x['b', 0] = 13, axis=1)
             ^
SyntaxError: expression cannot contain assignment, perhaps you meant "=="?

(I understand that there are better ways to handle that particular case, but this is just a made-up example.)

How can I set the value at the specified index of a Series? I would like to be able to just do something like s.set_value(idx, 'my_val') and have it return the modified (copied) Series.



Solution 1:[1]

You can use pandas.Series.where() to return a copy of the column with the item at the specified index replaced.

This matches on the index label, so it is basically like using .loc:

>>> df['b'].where(df['b'].index != 1, 13)
0     0
1    13
2     0
Name: b, dtype: int64
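
To make this read more like the s.set_value(idx, 'my_val') call the question asks for, the where() expression can be wrapped in a small function and chained with Series.pipe(). The following is only a sketch: set_at is a hypothetical helper name, not part of pandas, and it assumes the index labels are unique (every label equal to the given one would be replaced).

```python
import pandas as pd


def set_at(s, label, value):
    """Return a copy of `s` with the entry at index label `label` replaced.

    Hypothetical helper (not a pandas API). The input Series is not modified.
    """
    # where() keeps values where the condition is True and substitutes
    # `value` where it is False, i.e. at the matching label.
    return s.where(s.index != label, value)


df = pd.DataFrame({'a': [1, 2, 3], 'b': [0, 0, 0]})

# pipe() passes the Series as the first argument, so the call chains.
result = df['b'].pipe(set_at, 1, 13).mul(100)

# The same helper also fits inside an assign() pipeline via a lambda,
# because it is an expression rather than an assignment statement.
out = df.assign(b=lambda d: d['b'].pipe(set_at, 1, 13)).mul(100)
```

Because where() returns a new Series, df itself is left unchanged by both chains.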

If you have an index that isn't a RangeIndex, or that doesn't start from zero, you can build the condition from reset_index() instead. This works like the above, except that it selects by position, mimicking the behavior of .iloc rather than .loc:

>>> s = pd.Series({'a': 0, None: 0, True: 0})
>>> s
a       0
NaN     0
True    0
dtype: int64

>>> s.where(s.reset_index().index != 1, 13)
a        0
NaN     13
True     0
dtype: int64
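
As an alternative sketch of the same positional idea, the mask can be built from the positions 0..len(s)-1 with NumPy rather than from reset_index(). This swaps in np.arange for the reset_index() call used above. set_at_position is a hypothetical helper name, not part of pandas.

```python
import numpy as np
import pandas as pd


def set_at_position(s, pos, value):
    """Return a copy of `s` with the entry at integer position `pos` replaced.

    Hypothetical helper (not a pandas API). The input Series is not modified.
    """
    # Boolean array that is False only at the target position.
    mask = np.arange(len(s)) != pos
    # where() keeps values where the mask is True and uses `value` elsewhere.
    return s.where(mask, value)


s = pd.Series([0, 0, 0], index=['x', 'y', 'z'])

# Chainable via pipe(); the index labels play no part in the selection.
result = s.pipe(set_at_position, 1, 13)
```

Here the second element is replaced regardless of what its label is, and s keeps its original values.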

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 richardec