Getting `A value is trying to be set on a copy of a slice from a DataFrame.` when setting a column
I know a value should not be set on a view of a pandas DataFrame, and I don't think I'm doing that, but I'm still getting this warning. I have a function like this:
def do_something(df):
    # id(df) is xxx240
    idx = get_skip_idx(df)  # another function that returns a boolean series
    if any(idx):
        df = df[~idx]
        # id(df) is xxx744, df is now a local variable which is a copy of the input argument
    assert not df._is_view  # This doesn't fail, I'm not having a view
    df['date_fixed'] = pd.to_datetime(df['old_date'].str[:10], format='%Y-%m-%d')
    # I'm getting the warning here which doesn't make any sense to me
I'm using pandas 1.4.1. This sounds like a bug to me, wanted to confirm I'm not missing anything before filing a ticket.
Solution 1:[1]
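My understanding is that `_is_view` can return false negatives, so a passing assertion does not guarantee that pandas treats the result of `df[~idx]` as independent of the original dataframe.
One workaround is to replace `df[~idx]` with `df[~idx].copy()`, which makes the new dataframe an explicit copy: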
import pandas as pd

df = pd.DataFrame(
    {
        "value": [1, 2, 3],
        "old_date": ["2022-04-20 abcd", "2022-04-21 efgh", "2022-04-22 ijkl"],
    }
)

def do_something(df, idx):
    if any(idx):
        df = df[~idx].copy()
    df["date_fixed"] = pd.to_datetime(df["old_date"].str[:10], format="%Y-%m-%d")
    return df

print(do_something(df, pd.Series({0: True, 1: False, 2: False})))
# No warning
#    value         old_date date_fixed
# 1      2  2022-04-21 efgh 2022-04-21
# 2      3  2022-04-22 ijkl 2022-04-22
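As a possible alternative to the explicit `.copy()`, the new column can be built with `DataFrame.assign`, which returns a new dataframe instead of setting a column in place on the filtered selection. This is only a sketch and not part of the original answer: the `idx` series below is a hypothetical stand-in for the result of `get_skip_idx(df)`, and the sample data is reused from the example above.

```python
import pandas as pd


def do_something(df, idx):
    """Drop the rows flagged in idx and return a new frame with a parsed date column."""
    # df[~idx] keeps the rows that are not flagged; assign() then returns a
    # new DataFrame, so nothing is set in place on the filtered selection.
    return df[~idx].assign(
        date_fixed=lambda d: pd.to_datetime(d["old_date"].str[:10], format="%Y-%m-%d")
    )


df = pd.DataFrame(
    {
        "value": [1, 2, 3],
        "old_date": ["2022-04-20 abcd", "2022-04-21 efgh", "2022-04-22 ijkl"],
    }
)
# Hypothetical stand-in for get_skip_idx(df): skip the first row only.
idx = pd.Series([True, False, False])

result = do_something(df, idx)
print(result)
```

With this shape the caller's dataframe is left untouched and the function always returns a fresh object, which may be easier to reason about than reassigning `df` inside the function.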
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Laurent |
