"A value is trying to be set on a copy of a slice from a DataFrame" - pandas
I'm new to pandas and, given a data frame, I was trying to drop the rows that don't meet a specific requirement. Researching how to do it, I arrived at this structure:
df = df.loc[df['DS_FAMILIA_PROD'].isin(['CARTOES', 'CARTÕES'])]
However, when processing the frame, I get this warning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
self[name] = value
I'm not sure about what to do because I'm already using the .loc function.
What am I missing?
f = ['ID_manifest', 'issue_date', 'channel', 'product', 'ID_client', 'desc_manifest']
df = pd.DataFrame(columns=f)
for chunk in df2017_chunks:
    aux = preProcess(chunk, f)
    df = pd.concat([df, aux])
def preProcess(df, f):
    stops = list(stopwords.words("portuguese"))
    stops.extend(['reclama', 'cliente', 'santander', 'cartao', 'cartão'])
    df = df.loc[df['DS_FAMILIA_PROD'].isin(['CARTOES', 'CARTÕES'])]
    df.columns = f
    df.desc_manifest = df.desc_manifest.str.lower() # All lower case
    df.desc_manifest = df.desc_manifest.apply(lambda x: re.sub('[^A-zÀ-ÿ]', ' ', str(x))) # Just letters
    df.replace(['NaN', 'nan'], np.nan, inplace=True) # Remove nan
    df.dropna(subset=['desc_manifest'], inplace=True)
    df.desc_manifest = df.desc_manifest.apply(lambda x: [word for word in str(x).split() if word not in stops]) # Remove stop words
    return df
Solution 1:[1]
The purpose of the warning is to tell users that they may be operating on a copy rather than on the original, but it can produce false positives. As mentioned in the comments, this is not an issue for your use case.
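You can simply turn off the check for your dataframe (the is_copy attribute was deprecated in pandas 0.23 and removed in 1.0, so this only works on older versions):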
df.is_copy = False
or you can explicitly copy:
df = df.loc[df['DS_FAMILIA_PROD'].isin(['CARTOES', 'CARTÕES'])].copy()
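As a minimal sketch of the explicit-copy approach: the sample rows and the names raw, mask and filtered are made up for illustration, and only the DS_FAMILIA_PROD and desc_manifest column names come from the question. After .copy() the filtered frame is an independent object, so later column assignments do not trigger the warning and do not touch the source data:

```python
import pandas as pd

# Hypothetical sample data; only the column names are taken from the question.
raw = pd.DataFrame({
    "DS_FAMILIA_PROD": ["CARTOES", "SEGUROS", "CARTÕES"],
    "desc_manifest": ["Cartão BLOQUEADO", "Seguro cancelado", "Fatura ERRADA"],
})

# Boolean mask selecting the rows to keep.
mask = raw["DS_FAMILIA_PROD"].isin(["CARTOES", "CARTÕES"])

# .copy() makes the result an independent DataFrame.
filtered = raw.loc[mask].copy()

# Assigning to a column of the copy leaves `raw` untouched.
filtered["desc_manifest"] = filtered["desc_manifest"].str.lower()

print(filtered)
print(raw)
```

Assigning with filtered["desc_manifest"] = ... rather than attribute access (filtered.desc_manifest = ...) is a stylistic choice here; both work on an existing column.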
Solution 2:[2]
import pandas as pd

pd.set_option('mode.chained_assignment', 'warn')
# if you set a value on a copy, warning will show
df = pd.DataFrame({'DS_FAMILIA_PROD' : [1, 2, 3], 'COL2' : [5, 6, 7]})
df = df[df.DS_FAMILIA_PROD.isin([1, 2])]
df
Out[29]:
   COL2  DS_FAMILIA_PROD
0     5                1
1     6                2
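To illustrate what the warning is guarding against, here is a small sketch that reuses the sample values from the snippet above. The names original, subset and warning_names are illustrative, and whether a warning is actually emitted depends on the installed pandas version. It assigns to a column of the filtered frame and records any warnings instead of printing them:

```python
import warnings

import pandas as pd

original = pd.DataFrame({"DS_FAMILIA_PROD": [1, 2, 3], "COL2": [5, 6, 7]})

# Boolean-mask filtering returns a new object holding the selected rows.
subset = original[original["DS_FAMILIA_PROD"].isin([1, 2])]

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # The assignment lands on `subset`, not on `original`.
    subset["COL2"] = 0

# Collect the class names of any warnings raised by the assignment.
warning_names = [type(w.message).__name__ for w in caught]

print(warning_names)
print(original)
print(subset)
```

On pandas versions that still emit it, warning_names would be expected to contain SettingWithCopyWarning. Either way, original keeps its values, which is why the warning is harmless when the filtered frame is all you need.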
Solution 3:[3]
If your program takes a copy of the DataFrame on purpose, you can silence the warning globally with this:
pd.set_option('mode.chained_assignment', None)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | A.Kot |
| Solution 2 | |
| Solution 3 | Carlo Antonio Fernandez Benede |
